Mon. Not. R. Astron. Soc. 000, 000-000 (0000) 



Printed 6 March 2013 



(MN LaTeX style file v2.2)



Galaxy and Mass Assembly (GAMA): Colour and
luminosity dependent clustering from calibrated 
photometric redshifts 

L. Christodoulou 1*, C. Eminian 1, J. Loveday 1, P. Norberg 2, I.K. Baldry 3,
P.D. Hurley 1, S.P. Driver 4,5, S.P. Bamford 6, A.M. Hopkins 7, J. Liske 8,
J.A. Peacock 9, J. Bland-Hawthorn 10, S. Brough 7, E. Cameron 11, C.J. Conselice 6,
S.M. Croom 10, C.S. Frenk 2, M. Gunawardhana 10, D.H. Jones 12, L.S. Kelvin 4,5,
K. Kuijken 13, R.C. Nichol 14, H. Parkinson 9, K.A. Pimbblet 12, C.C. Popescu 15,
M. Prescott 3, A.S.G. Robotham 4,5, R.C. Sharp 16, W.J. Sutherland 17,
E.N. Taylor 18, D. Thomas 14, R.J. Tuffs 19, E. van Kampen 8, D. Wijesinghe 10

1 Astronomy Centre, University of Sussex, Falmer, Brighton BN1 9QH, UK 

2 Institute for Computational Cosmology, Department of Physics, Durham University, South Road, Durham DH1 3LE, UK 

3 Astrophysics Research Institute, Liverpool John Moores University, Twelve Quays House, Egerton Wharf, Birkenhead, CH41 1LD, UK

4 ICRAR (International Centre for Radio Astronomy Research), University of Western Australia, Crawley, WA 6009, Australia

5 SUPA (Scottish Universities Physics Alliance), School of Physics & Astronomy, University of St Andrews, North Haugh, St Andrews, Fife, KY16 9SS, UK

6 Centre for Astronomy and Particle Theory, University of Nottingham, University Park, Nottingham NG7 2RD, UK 

7 Australian Astronomical Observatory, P.O. Box 296, Epping, NSW 1710, Australia 

8 European Southern Observatory, Karl-Schwarzschild-Str. 2, 85748 Garching, Germany 

9 Institute for Astronomy, University of Edinburgh, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ, Scotland 

10 Sydney Institute for Astronomy, School of Physics, University of Sydney, NSW 2006, Australia 

11 Department of Physics, Swiss Federal Institute of Technology (ETH-Zürich), 8093 Zürich, Switzerland

12 School of Physics, Monash University, Clayton, Victoria 3800, Australia 

13 Leiden University, P. O. Box 9500, 2300 RA Leiden, The Netherlands 

14 Institute of Cosmology and Gravitation (ICG), University of Portsmouth, Dennis Sciama Building, Burnaby Road, Portsmouth PO1 3FX, UK

15 Jeremiah Horrocks Institute, University of Central Lancashire, Preston PR1 2HE, UK

16 Research School of Astronomy & Astrophysics, Mount Stromlo Observatory, Weston Creek, ACT 2611, Australia 

17 Astronomy Unit, Queen Mary University London, Mile End Rd, London E1 4NS, UK

18 School of Physics, University of Melbourne, Victoria 3010, Australia 

19 Max Planck Institute for Nuclear Physics (MPIK), Saupfercheckweg 1, 69117 Heidelberg, Germany 






ABSTRACT 

We measure the two-point angular correlation function of a sample of 4,289,223
galaxies with $r < 19.4$ mag from the Sloan Digital Sky Survey as a function of
photometric redshift, absolute magnitude and colour down to
$M_r - 5\log h = -14$ mag. Photometric redshifts are estimated from ugriz model
magnitudes and two Petrosian radii using the artificial neural network package
ANNz, taking advantage of the Galaxy and Mass Assembly (GAMA) spectroscopic
sample as our training set. The photometric redshifts are then used to determine
absolute magnitudes and colours. For all our samples, we estimate the underlying
redshift and absolute magnitude distributions using Monte-Carlo resampling.
These redshift distributions are used in Limber's equation to obtain spatial
correlation function parameters from power law fits to the angular correlation
function. We confirm an increase in clustering strength for sub-$L^*$ red
galaxies compared with $\sim L^*$ red galaxies at small scales in all redshift
bins, whereas for the blue population the correlation length is almost
independent of luminosity for $\sim L^*$ galaxies and fainter. A linear relation
between relative bias and log luminosity is found to hold down to luminosities
$L \sim 0.03 L^*$. We find that the redshift dependence of the bias of the $L^*$
population can be described by the passive evolution model of Tegmark & Peebles
(1998). A visual inspection of a random subset of our $r < 19.4$ SDSS galaxy
sample reveals that about 10 per cent are spurious, with a higher contamination
rate towards very faint absolute magnitudes due to over-deblended nearby
galaxies. We correct for this contamination in our clustering analysis.

Key words: galaxies: clustering, photometric redshift, faint population 



1 INTRODUCTION 

Measurement of galaxy clustering is an important cosmological tool for
understanding the formation and evolution of galaxies at different epochs. The
dependence of galaxy clustering on properties such as morphology, colour,
luminosity or spectral type has been established over many decades. Elliptical
galaxies or galaxies with red colours, which both trace an old stellar
population, are known to be more clustered than spiral galaxies (e.g. Davis &
Geller 1976; Dressler 1980; Postman & Geller 1984; Loveday et al. 1995; Guzzo
et al. 1997; Goto et al. 2003). Recent large galaxy surveys have allowed the
investigation of galaxy clustering as a function of both colour and luminosity
(Norberg et al. 2002; Budavári et al. 2003; Zehavi et al. 2005; Wang et al.
2007; McCracken et al. 2008; Zehavi et al. 2011). Among the red population, a
strong luminosity dependence has been observed whereby luminous galaxies are
more clustered, because they reside in denser environments.

The galaxy luminosity function shows an increasing faint-end density to at
least as faint as $M_r - 5\log h = -12$ mag (Blanton et al. 2005a; Loveday et
al. 2012); thus intrinsically faint galaxies represent the majority of the
galaxies in the universe. These galaxies with luminosity $L \ll L^*$ have low
stellar mass and are mostly dwarf galaxies with ongoing star formation. However,
because most wide-field spectroscopic surveys can only probe luminous galaxies
over large volumes, this population is often under-represented. Previous
clustering analyses have revealed that intrinsically faint galaxies have
different properties to luminous ones. A striking difference appears between
galaxy colours in this regime: while faint blue galaxies seem to cluster on a
scale almost independent of luminosity, the clustering of the faint red
population is shown to be very sensitive to luminosity (Norberg et al. 2001,
2002; Zehavi et al. 2002; Hogg et al. 2003; Zehavi et al. 2005; Swanson et al.
2008a; Zehavi et al. 2011; Ross et al. 2011b). As found by Zehavi et al.
(2005), this trend is naturally explained by the halo occupation distribution
framework. In this picture, the faint red population corresponds to red
satellite galaxies, which are located in high mass halos with red central
galaxies and are therefore strongly clustered. Recently, Ross et al. (2011b)
compiled from the literature bias measurements for red galaxies over a wide
range of luminosities for both spectroscopic and photometric data. They showed
that the bias measurements of the faint red population are strongly affected by
non-linear effects and thus depend on the physical scales over which they are
measured. They conclude that red galaxies with $M_r > -19$ mag are similarly or
less biased than red galaxies of intermediate luminosity.

In this work, we make use of photometric redshifts to probe the regime of
intrinsically faint galaxies. Our sample is composed of SDSS galaxies with
r-band Petrosian magnitude $r_{\rm petro} < 19.4$. As we have an ideal training
set for this sample, thanks to the GAMA survey (Driver et al. 2011), we use the
artificial neural network package ANNz (Collister & Lahav 2004) to predict
photometric redshifts. We then calculate the angular two-point correlation
function as a function of absolute magnitude and colour. The correlation length
of each sample is computed through the inversion of Limber's equation, using
Monte-Carlo resampling for modelling the underlying redshift distribution.
Recently, Zehavi et al. (2011) presented the clustering properties of the DR7
spectroscopic sample of SDSS. They extracted a sample of $\sim 700,000$ galaxies
with redshifts to $r \lesssim 17.6$ mag, covering an area of 8000 deg$^2$. Their
study of the luminosity and colour dependence uses power law fits to the
projected correlation function. Our study is complementary to theirs, since we
are using calibrated photo-zs of fainter galaxies from the same SDSS imaging
catalogue. We use similar luminosity bins to Zehavi et al., with the addition of
a fainter luminosity bin $-17 < M_r - 5\log h < -14$.

Small-scale ($r < 0.1\,h^{-1}$ Mpc) galaxy clustering provides additional tests
of the fundamental problem of how galaxies trace dark matter. Previous studies
have used SDSS data and the projected correlation function to study the
clustering of galaxies at the smallest scales possible (Masjedi et al. 2006),
using extensive modelling to account for the fibre constraint in SDSS
spectroscopic data. The interpretation of these results offers unique tests of
how galaxies trace dark matter and of the inner structure of dark matter halos
(Watson et al. 2011). Motivated by these studies we present measurements of the
angular correlation function down to scales of $\theta \approx 0.005$ degrees.
We work solely with the angular correlation function and we pay particular
attention to systematic errors and the quality of the data.

On the other hand, on sufficiently large scales ($r > 60\,h^{-1}$ Mpc), it is
expected that the galaxy density field evolves linearly following the evolution
of the dark matter density field (Tegmark et al. 2006). However, it is less
clear if this assumption holds on smaller scales, where the complicated physics
of galaxy formation and evolution dominates. In the absence of sufficient
spectroscopic data to comprehensively study the evolution of clustering, Ross
et al. (2010) used SDSS photometric redshifts to extract a volume-limited sample
with $M_r < -21.2$ and $z_{\rm phot} < 0.4$. Their analysis revealed significant
deviations from the passive evolution model of Tegmark & Peebles (1998). Here
we perform a similar analysis, again using photometric redshifts, for the $L^*$
population.

This paper is organised as follows. In Section 2 we introduce the statistical
quantities used to calculate the clustering of galaxies, with an emphasis on the
angular correlation function. In Section 3 we present our data for this study
and the method for estimating the clustering errors. In Section 4 we describe
the procedure that we followed in order to obtain the photometric redshifts. We
then investigate the clustering of our photometric sample, containing a large
number of intrinsically faint galaxies, in Section 5. In Section 6 we present
bias measurements as functions of colour, luminosity and redshift. Our findings
are summarised in Section 7. In Appendix A we show how we extracted our initial
catalogue from the SDSS DR7 database and finally in Appendix B we describe in
some detail the tests performed to assess systematic errors.

Throughout we assume a standard flat $\Lambda$CDM cosmology, with
$\Omega_m = 0.30$, $\Omega_\Lambda = 0.70$ and $H_0 = 100h$ km s$^{-1}$
Mpc$^{-1}$.



2 THE TWO-POINT ANGULAR 
CORRELATION FUNCTION 

2.1 Definition 

The simplest way to measure galaxy clustering on the sky is via the two-point
correlation function, $w(\theta)$, which gives the excess probability of finding
two galaxies at an angular separation $\theta$ compared to a random Poisson
distribution (Peebles 1980, § 31):

$$dP = \bar{n}^2 \left[1 + w(\theta)\right] d\Omega_1\, d\Omega_2, \qquad (1)$$

where $dP$ is the joint probability of finding galaxies in solid angles
$d\Omega_1$ and $d\Omega_2$ separated by $\theta$, and $\bar{n}$ is the mean
number of objects per solid angle. If $w(\theta) = 0$, then the galaxies are
unclustered and randomly distributed at this separation. We consider various
estimators for $w(\theta)$ in Section 2.3.



2.2 Power law approximation 

Over small angular separations, the two-point correlation function can be
approximated by a power law:

$$w(\theta) = A_w\, \theta^{1-\gamma}, \qquad (2)$$

where $A_w$ is the amplitude. The amplitude of the correlation function of a
galaxy population is reduced as we go to higher redshifts, because equal angular
separations trace larger spatial separations for more distant objects. By
contrast, the slope, $1 - \gamma$, of the correlation function is observed to
vary little from sample to sample, with $\gamma \approx 1.8$. It is mostly
sensitive to galaxy colours (see Section 5).
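
As an illustration of the power-law form of equation (2), the sketch below recovers an amplitude and slope from a noiseless mock $w(\theta)$ by a straight-line fit in log-log space. It is a minimal example only: it uses an unweighted least-squares fit and ignores the covariance between angular bins, and the function name `fit_power_law` and the mock values are assumptions made for the illustration, not quantities taken from this paper.

```python
import numpy as np

def fit_power_law(theta, w):
    """Unweighted straight-line fit of log10(w) against log10(theta).

    For w = A_w * theta**(1 - gamma) the slope of the line is (1 - gamma)
    and the intercept is log10(A_w).
    """
    slope, intercept = np.polyfit(np.log10(theta), np.log10(w), 1)
    return 10.0 ** intercept, 1.0 - slope

# Hypothetical noiseless mock: A_w = 0.05 (theta in degrees), gamma = 1.8.
theta = np.logspace(-2, 0, 10)
w_mock = 0.05 * theta ** (1.0 - 1.8)
amplitude, gamma = fit_power_law(theta, w_mock)
```

With real measurements the bins are correlated, so a fit that uses the full covariance matrix would normally replace the unweighted fit shown here.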



2.3 Estimator 

In practice, the calculation of $w(\theta)$ is done through the normalised
counts of galaxy-galaxy pairs $DD(\theta)$ from the data, random-random pairs
$RR(\theta)$ from an unclustered random catalogue which follows the survey
angular selection function, and galaxy-random pairs $DR(\theta)$. Various
expressions have been used to calculate $w(\theta)$. In this work we adopt the
estimator introduced by Landy & Szalay (1993), which is widely used in the
literature:

$$w(\theta) = \frac{DD(\theta) - 2DR(\theta) + RR(\theta)}{RR(\theta)}. \qquad (3)$$

Landy & Szalay (1993) showed that this estimator has a small variance, close to
Poisson, and allows one to measure correlation functions with minimal
uncertainty and bias. The counts $DD(\theta)$, $DR(\theta)$ and $RR(\theta)$
have to be normalised to allow for different total numbers of galaxies $n_g$ and
random points $n_r$:

$$DD(\theta) = \frac{N_{gg}(\theta)}{n_g(n_g - 1)/2}, \qquad
DR(\theta) = \frac{N_{gr}(\theta)}{n_g n_r}, \qquad
RR(\theta) = \frac{N_{rr}(\theta)}{n_r(n_r - 1)/2}.$$

We use approximately ten times as many random points as galaxies in order that
the results do not depend on a particular realization of the random
distribution. We also tried an alternative estimator proposed by Hamilton
(1993), which revealed no significant changes in the correlation function
measurements.
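
The estimator of equation (3) and the normalisation of the pair counts can be written compactly as follows. This is a minimal sketch, not the code used for the measurements in this paper: the function name `landy_szalay` is hypothetical, the raw pair counts are assumed to have been accumulated elsewhere, and the cross-pair normalisation $n_g n_r$ is the standard choice assumed here.

```python
import numpy as np

def landy_szalay(n_gg, n_gr, n_rr, n_g, n_r):
    """Landy & Szalay (1993) estimator from raw pair counts per angular bin.

    n_gg, n_gr, n_rr : raw galaxy-galaxy, galaxy-random, random-random counts
    n_g, n_r         : total numbers of galaxies and of random points
    """
    dd = np.asarray(n_gg, dtype=float) / (n_g * (n_g - 1) / 2.0)
    dr = np.asarray(n_gr, dtype=float) / (n_g * n_r)
    rr = np.asarray(n_rr, dtype=float) / (n_r * (n_r - 1) / 2.0)
    return (dd - 2.0 * dr + rr) / rr

# Hypothetical counts: the same fraction of all possible pairs falls in each
# bin for DD, DR and RR, which corresponds to an unclustered sample.
n_g, n_r = 100, 1000
frac = np.array([0.01, 0.02])
w_flat = landy_szalay(frac * 4950, frac * 100000, frac * 499500, n_g, n_r)

# Doubling the galaxy-galaxy counts gives DD = 2 RR and hence w = 1.
w_clustered = landy_szalay(2 * frac * 4950, frac * 100000, frac * 499500, n_g, n_r)
```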

Estimates of the angular correlation function are affected by an integral
constraint of the form

$$\frac{1}{\Omega^2} \int\!\!\int w(\theta_{12})\, d\Omega_1\, d\Omega_2 = 0, \qquad (4)$$

where the integral is over all pairs of elements of solid angle $\Omega$ within
the survey area. The constraint requires that $w(\theta)$ goes negative at large
separations, to balance the positive clustering signal at smaller separations.
However, for wide-field surveys like SDSS the integral constraint has a
negligible effect on $w(\theta)$, even on large scales. We find that the
additive correction for the integral constraint is at least two orders of
magnitude smaller than the value of $w(\theta)$ at $\theta = 9.4$ degrees. Thus
the integral constraint does not bias our clustering measurements.



2.4 Spatial correlation function 

We are interested in the spatial clustering and the physical separations at
which galaxies are clustered, in order to compare data against theory. To this
end, we need to calculate the spatial correlation function from our angular
correlation function, which is simply its projection on the sky. The spatial
correlation function, $\xi(r)$, can also be expressed as a power law

$$\xi(r) = \left(\frac{r}{r_0}\right)^{-\gamma}, \qquad (5)$$

where $r_0$ is the correlation length. It corresponds to the proper separation
at which the probability of finding two galaxies is twice that of a random
distribution, $\xi(r_0) = 1$. Limber (1953) demonstrated that the power law
approximation for $\xi(r)$ in equation 5 leads to the power law defined in
equation 2, with the index $\gamma$ being the same in both cases. Phillipps et
al. (1978) expressed the amplitude of the correlation function, $A_w$, as a
function of the proper correlation length, $r_0$, and of the selection function
of the survey, whereas later studies propose similar equations where the
selection function is implicitly included in the redshift distribution.

Now, writing the angular correlation function as
$w(\theta) = A_w \theta^{1-\gamma}$, Limber's equation becomes (Peebles 1980,
§ 52, 56):

$$A_w = C\, r_0^{\gamma}\,
\frac{\int_0^\infty g(z)\,(dN/dz)^2\, dz}{\left[\int_0^\infty (dN/dz)\, dz\right]^2}, \qquad (6)$$

where $dN/dz$ is the redshift distribution$^1$, which is zero everywhere outside
the limits $z_{\rm min}$ and $z_{\rm max}$, and

$$C = \pi^{1/2}\, \frac{\Gamma[(\gamma - 1)/2]}{\Gamma(\gamma/2)},$$

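with $\Gamma$ the gamma function. The quantity $g(z)$ is defined as

$$g(z) = \left(\frac{dz}{dx}\right) x^{1-\gamma} F(x),$$

where $x$ is the comoving coordinate distance and $F(x)$ is related to the
curvature factor $k$ in the Robertson-Walker metric by:

$$F(x) = 1 - k x^2.$$

We assume zero curvature, and so $F(x) = 1$.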
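
To make the use of Limber's equation concrete, the following sketch evaluates the amplitude $A_w$ for a given $r_0$, $\gamma$ and tabulated $dN/dz$, and inverts the relation to recover $r_0$. It is an illustration under stated assumptions rather than the pipeline used here: it assumes the standard form $g(z) = (dz/dx)\,x^{1-\gamma}$ with $F(x) = 1$, a flat cosmology with $\Omega_m = 0.30$, distances in $h^{-1}$ Mpc and $\theta$ in radians. All function names and the mock redshift distribution are hypothetical.

```python
import numpy as np
from math import gamma as gamma_fn, pi, sqrt

C_OVER_H0 = 2997.92458  # Hubble distance c/H0 in h^-1 Mpc

def trapezoid(y, x):
    """Composite trapezoidal rule on tabulated values."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def hubble_ratio(z, omega_m=0.30):
    """E(z) = H(z)/H0 for a flat Lambda-CDM model."""
    return np.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))

def comoving_distance(z, omega_m=0.30, n_steps=2001):
    """Comoving distance x(z) in h^-1 Mpc for a flat model."""
    zz = np.linspace(0.0, z, n_steps)
    return C_OVER_H0 * trapezoid(1.0 / hubble_ratio(zz, omega_m), zz)

def limber_amplitude(r0, gam, z, dndz, omega_m=0.30):
    """A_w from equation (6), assuming g(z) = (dz/dx) x^(1-gamma), F = 1."""
    z = np.asarray(z, dtype=float)
    dndz = np.asarray(dndz, dtype=float)
    x = np.array([comoving_distance(zi, omega_m) for zi in z])
    dz_dx = hubble_ratio(z, omega_m) / C_OVER_H0
    g = dz_dx * x ** (1.0 - gam)
    c_gamma = sqrt(pi) * gamma_fn((gam - 1.0) / 2.0) / gamma_fn(gam / 2.0)
    return c_gamma * r0 ** gam * trapezoid(g * dndz ** 2, z) / trapezoid(dndz, z) ** 2

def correlation_length(a_w, gam, z, dndz, omega_m=0.30):
    """Invert equation (6): A_w scales as r0**gamma."""
    return (a_w / limber_amplitude(1.0, gam, z, dndz, omega_m)) ** (1.0 / gam)

# Hypothetical smooth redshift distribution on a grid that avoids z = 0.
z_grid = np.linspace(0.01, 0.5, 200)
dndz_mock = z_grid ** 2 * np.exp(-(z_grid / 0.15) ** 1.5)
a_w = limber_amplitude(5.0, 1.8, z_grid, dndz_mock)
r0_back = correlation_length(a_w, 1.8, z_grid, dndz_mock)
```

Because $dN/dz$ enters squared in the numerator and as a squared integral in the denominator, its overall normalisation cancels. An amplitude fitted with $\theta$ in degrees would need to be multiplied by $(180/\pi)^{1-\gamma}$ before being passed to `correlation_length`.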

When using equation 6 we need to determine the redshift distribution of the
sample with precision. We address this issue in Section 4.3. Another subtle
complication which arises from the use of equation 6 is that galaxy clustering
is assumed to be independent of galaxy properties such as colour and luminosity
(Peebles 1980, § 51). Therefore it is particularly important to use samples with
fixed colour and luminosity, instead of mixed populations, when studying galaxy
clustering using Limber's approximation. We address this issue in Section 4.2,
where we define the colour and luminosity bins for the clustering analysis.



3 DATA 

To carry out this analysis, we take advantage of the Galaxy and Mass Assembly
(GAMA) survey (Driver et al. 2011). This spectroscopic sample, at low to
intermediate redshifts, forms an ideal training set for predicting photometric
redshifts of faint galaxies. The galaxies considered for the calculation of the
correlation functions are drawn from the seventh data release of the Sloan
Digital Sky Survey photometric sample (SDSS DR7; Abazajian et al. 2009). We
briefly outline the properties of these samples below.



3.1 SDSS DR7 photometric sample 

At the time of writing, the Sloan Digital Sky Survey (SDSS) is the largest local
galaxy survey ever undertaken. The completed SDSS maps almost one quarter of the
sky, with optical photometry in u, g, r, i and z bands and spectra for
$\sim 10^6$ galaxies. The main goal of the survey is to provide data for
large-scale structure studies of the local universe. A series of papers describe
the survey: technical information about the data products and the pipeline can
be found in York et al. (2000) and in Stoughton et al. (2002). Details about the
photometric system can be found in Fukugita et al. (1996).

The SDSS imaging survey is completed with the seventh data release (Abazajian et
al. 2009), which we use in this paper. The main program of SDSS is concentrated
in the Northern Galactic cap, with three 2.5° stripes in the Southern Galactic
cap. SDSS DR7 contains about $5.5 \times 10^6$ galaxies with
$r_{\rm petro} < 19.4$ over 7,646 deg$^2$ of sky.

The images are obtained with a 2.5-meter telescope, located at Apache Point
Observatory, New Mexico. Various flux measures are available for galaxies in the
SDSS database (Stoughton et al. 2002), including Petrosian fluxes, model fluxes
(corresponding to whichever of a de Vaucouleurs or exponential profile provides
a better fit to the observed galaxy profile), and aperture fluxes. In this paper
we use model magnitudes to calculate galaxy colours and Petrosian magnitudes to
split galaxies into absolute magnitude ranges. Following Schlegel et al. (1998),
we correct the magnitudes using the dust attenuation corrections provided for
each object and each filter in the SDSS database.

The star-galaxy classification adopted by the SDSS photometric pipeline is based
on the difference between an object's PSF magnitude (calculated assuming a point
spread function profile, as for a stellar source) and its model magnitude. An
object is then classified as a galaxy if it satisfies the criterion (Stoughton
et al. 2002)

$$m_{\rm psf,tot} - m_{\rm model,tot} > 0.145, \qquad (7)$$

where the $m_{\rm psf,tot}$ and $m_{\rm model,tot}$ magnitudes are obtained from
the sum of the fluxes over the ugriz photometric bands. This cut works at the 95
per cent confidence level for galaxies with $r < 21$. In Section 3.2 we discuss
a different star-galaxy classification, following the GAMA survey, which is the
one we adopt for this work (see also Appendix A).

A photometric redshift study can be vulnerable to contamination not only due to
stars misclassified as galaxies, but also due to over-deblended sources
(Scranton et al. 2002), usually coming from local spiral galaxies. This imposes
limits on the angular scale over which we can probe the correlation function. In
order to test for this systematic in our sample, in Appendix B4 we visually
inspect random samples of the data and then model the contamination as a
function of angular separation.



3.2 GAMA sample 

The Galaxy and Mass Assembly (GAMA) project is a combination of several ground
and space-based surveys with the aim of improving our understanding of galaxy
formation and evolution (Driver et al. 2011). GAMA uses the AAOmega spectrograph
of the Anglo-Australian Telescope (AAT) for spectroscopy (Saunders et al. 2004;
Sharp et al. 2006). Its targets are selected from the SDSS photometric sample.
Target selection is described in detail by Baldry et al. (2010). The main
restriction is that the source is detected as an extended object:
$r_{\rm psf} - r_{\rm model} > 0.25$. As shown in Appendix A, this criterion is
also adopted for our sample extraction from SDSS. This criterion is more
restrictive, in the sense that fewer stars will be misclassified as galaxies,
than the star-galaxy classification adopted by the SDSS photometric pipeline
(previous section), but similar to that used for the SDSS main galaxy
spectroscopic sample (Strauss et al. 2002).

1 We use the expressions dN/dz and N(z) interchangeably for the redshift
distribution.
The GAMA survey is almost 99 per cent spectroscopically complete over its 144
deg$^2$ area to $r_{\rm petro} = 19.4$ mag (Driver et al. 2011). GAMA phase 1
(comprising 3 years of observations) includes 95,592 reliable spectroscopic
galaxy redshifts to this magnitude limit, extending to redshift $z \approx 0.5$.
Of these redshifts, 76,360 have been newly acquired by the GAMA team. The rest
come from previous surveys: SDSS (Abazajian et al. 2009), 2dFGRS (Colless et al.
2001; Cole et al. 2005), 6dFGS (Jones et al. 2004), MGC (Driver et al. 2005) and
2SLAQ (Cannon et al. 2006). The overall GAMA redshift distribution is shown in
Fig. 13 of Driver et al. (2011).

For a consistent training of ANNz it is necessary to match all the GAMA objects
with SDSS DR7 übercal photometry (Padmanabhan et al. 2008) and perform identical
colour cuts. Once we apply the colour cuts (Section 3.3) necessary for the
optimization of ANNz performance, and low and high redshift cuts
($0.002 < z < 0.5$), 93,584 redshifts remain. They are used to train our
photometric redshift neural net algorithm as described in Section 4.



3.3 Colour cuts 

Before we build our final sample from ANNz, we remove galaxies with outlier
$u - g$, $g - r$, $r - i$, $i - z$ colours both in the SDSS imaging sample and
in the training set, because photometric redshift estimates are based primarily
on these colours. The complete colour and magnitude cuts are given in Table 1.
Less than 1 per cent of the galaxies are affected by the colour cuts. These
colour cuts could in principle affect the mask that we use for correlation
function calculations. To estimate the extent of this effect we study the
distribution on the sky of the colour outliers as well as their angular
correlation function. This exercise reveals that colour outliers have a spurious
correlation an order of magnitude larger on all angular scales than the
correlation function of our final sample. However, since the number of these
objects is almost three orders of magnitude less than the total, they would have
a negligible effect on $w(\theta)$ measurements if included.



3.4 Final sample 

Our aim is to obtain a galaxy sample with photometric properties as close as
possible to our training set. To this end, we have selected galaxies from the
SDSS DR7 photometric sample with the query used to select GAMA targets (Appendix
A). We select galaxies which have "clean" photometry according to the
instructions given on the SDSS website$^3$. Our sample is hence limited by
$r_{\rm petro} < 19.4$ and satisfies the criterion for star-galaxy separation
$r_{\rm psf} - r_{\rm model} > 0.25$. In our analysis, we choose to calculate
the correlation function for galaxies located in the SDSS northern cap,
corresponding to 92 per cent of SDSS DR7 galaxies. As such, the geometry of the
survey is simplified to a contiguous area. Our final sample, after the colour
cuts given in Table 1, comprises 4,890,965 galaxies.

Table 1. Colour and apparent magnitude cuts for the optimization of ANNz. All
magnitudes are SDSS model magnitudes.

$12.0 < r_{\rm petro} < 19.4$
$-2 < u - g < 7$
$-2 < g - r < 5$
$-2 < r - i < 5$
$-2 < i - z < 5$
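
The magnitude limits, the colour cuts of Table 1 and the star-galaxy separation criterion can be combined into a single boolean selection, as sketched below. This is only an illustration of the cuts quoted in the text: the function name `select_galaxies` and its argument names are hypothetical, it does not reproduce the database query of Appendix A, and it does not apply the "clean" photometry flags.

```python
import numpy as np

def select_galaxies(r_petro, u, g, r, i, z, r_psf, r_model):
    """Boolean mask for objects passing the Table 1 cuts and the
    extended-source criterion r_psf - r_model > 0.25."""
    r_petro, u, g, r, i, z, r_psf, r_model = (
        np.asarray(a, dtype=float)
        for a in (r_petro, u, g, r, i, z, r_psf, r_model)
    )
    magnitude_ok = (r_petro > 12.0) & (r_petro < 19.4)
    colours_ok = (
        (u - g > -2) & (u - g < 7)
        & (g - r > -2) & (g - r < 5)
        & (r - i > -2) & (r - i < 5)
        & (i - z > -2) & (i - z < 5)
    )
    extended = (r_psf - r_model) > 0.25
    return magnitude_ok & colours_ok & extended

# Hypothetical objects: a normal galaxy, one fainter than the limit,
# one star-like source and one colour outlier.
mask = select_galaxies(
    r_petro=[18.0, 19.8, 17.0, 16.0],
    u=[20.0, 21.5, 19.0, 25.0],
    g=[19.0, 20.6, 18.0, 17.0],
    r=[18.2, 19.9, 17.2, 16.2],
    i=[17.9, 19.6, 16.9, 15.9],
    z=[17.7, 19.4, 16.7, 15.7],
    r_psf=[18.9, 20.6, 17.3, 16.9],
    r_model=[18.2, 19.9, 17.2, 16.2],
)
```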

To evaluate the number of data-random and random-random pairs in equation 3 we
need to build a mask for our sample. The mask precisely defines the sky coverage
of the sample. We use the file lss_combmask.dr72.ply in the NYU Value Added
Catalogue$^4$ (Blanton et al. 2005b), mapping SDSS stripes, as our mask. This
file contains the coordinates of the fields observed by SDSS expressed in
spherical polygons, excluding areas around bright stars because galaxies in
these regions can be affected by photometric errors. It is also suitably
formatted for use with the mangle software (Hamilton 1993; Hamilton & Tegmark
2004; Swanson et al. 2008b), a tool for manipulating survey masks and obtaining
random points with the exact geometry of the mask. Once masking is applied,
4,511,011 galaxies remain in our sample.

The upper panel of Fig. 1 shows the boundaries of the final mask for SDSS DR7
that we use for creating random catalogues. Our random catalogues consist of
$\sim 10^7$ objects, approximately ten times larger than the number of galaxies
in each luminosity and colour bin. Consistency checks have shown that our
clustering results are not sensitive to any particular realization of the random
catalogue. In Appendix B1 we check the accuracy of the survey mask, as well as
the photometric uniformity of the sample, by studying the angular clustering of
our sample as a function of r-band apparent magnitude.



3.5 Pixelisation scheme and jackknife resampling 

In order to speed up the computation of the correlation function, we pixelise
our data according to the SDSSPix$^5$ scheme. The basic concept consists of
assigning galaxies located in a portion of the sky to a pixel. After this step,
we only need to take into account galaxies in the same pixel and in the
neighbouring pixels to calculate the correlation function up to the scale of a
pixel. SDSSPix divides the sky along SDSS $\eta$ and $\lambda$ spherical
coordinates (as defined in Section 3.2.2 of Stoughton et al. 2002) into equal
spherical areas. Different resolutions are available according to the angular
scale


3 http://www.sdss.org/dr7/products/catalogs/flags.html

4 http://sdss.physics.nyu.edu/vagc/

5 http://dls.physics.ucdavis.edu/~scranton/SDSSPix/

Figure 1. The upper panel shows the jackknife regions used for 
the error estimation of our correlation function measurements. Af- 
ter modifying the SDSSPix scheme, there are 80 jackknife regions 
which contain approximately equal numbers of random points. 
The lower panel reports the normalized area of each pixel, based 
on a random catalogue. The deviations from uniformity show that 
differences in the areas of the JK regions are limited to ±30 per 
cent at most. 



of interest. We choose the resolution called basic resolution (resolution = 1).
This divides the sky into 468 pixels of size $\sim 9.4 \times 9.4$ deg. Then,
for galaxies in a given pixel, that pixel and its 8 direct neighbouring pixels
include all neighbouring galaxies with separations up to 9.4 degrees, the
largest angular separation we consider (see Section 5).
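
The idea of restricting the pair search to a pixel and its 8 neighbours can be illustrated with a simple cell list on a flat plane. The sketch below is not SDSSPix: it uses square cells in Cartesian coordinates and Euclidean separations rather than equal-area pixels on the sphere, and the function names are hypothetical. It only demonstrates that, when the cell size equals the maximum separation of interest, no pair is missed by ignoring more distant cells.

```python
import numpy as np
from collections import defaultdict

def count_pairs_grid(xy, max_sep):
    """Count pairs closer than max_sep using cells of side max_sep."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(xy):
        cells[(int(np.floor(x / max_sep)), int(np.floor(y / max_sep)))].append(idx)
    n_pairs = 0
    for (cx, cy), members in cells.items():
        # Only the cell itself and its 8 direct neighbours can hold partners.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                other = cells.get((cx + dx, cy + dy))
                if other is None:
                    continue
                for a in members:
                    for b in other:
                        # a < b ensures each pair is counted exactly once.
                        if a < b and np.hypot(*(xy[a] - xy[b])) <= max_sep:
                            n_pairs += 1
    return n_pairs

def count_pairs_brute(xy, max_sep):
    """Reference count over all pairs, for comparison."""
    d = np.hypot(xy[:, None, 0] - xy[None, :, 0], xy[:, None, 1] - xy[None, :, 1])
    return int(np.sum(np.triu(d <= max_sep, k=1)))

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(200, 2))
n_grid = count_pairs_grid(points, 1.0)
n_brute = count_pairs_brute(points, 1.0)
```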

We also use this pixelisation scheme to define the Jackknife (JK) regions for
the error analysis. In order to minimize the variation in the number of objects
in each JK region, some neighbouring pixels that contain the survey boundary are
merged in order that they contain a more nearly equal number of random points.
This modification of the SDSSPix pixelisation yields 80 JK regions, as shown in
the upper panel of Fig. 1. The lower panel of Fig. 1 presents the relative
variation in area of each region, as measured by the relative number of randoms
each one contains. Hereafter, errors on $w(\theta)$ are determined from 80 JK
resamplings, by calculating $w(\theta)$ omitting each region in turn. We have
checked that our results are not significantly affected by using either 104 or
40 Jackknife regions. The elements of the covariance matrix, $C$, are given by:

$$C_{ij} = \frac{N-1}{N} \sum_{k=1}^{N}
\left[\log(w_i^k) - \log(\bar{w}_i)\right]
\left[\log(w_j^k) - \log(\bar{w}_j)\right], \qquad (8)$$

where $w_i^k$ is the angular correlation function of the $k$th JK resampling on
scale $\theta_i$, $\bar{w}_i$ the mean angular correlation function and $N$ the
total number of JK resamplings. In practice, $\bar{w}_i$ is identical with the
angular correlation function measurement from the whole survey area. The $N - 1$
factor in the numerator of equation 8 accounts for correlations inherent in the
jackknife procedure (Miller 1974).
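
The covariance of equation (8) is straightforward to compute once $w(\theta)$ has been measured for each JK resampling. The sketch below is a minimal example under two assumptions that the text does not fix: base-10 logarithms are used, and the reference $\bar{w}_i$ defaults to the mean over resamplings unless the full-survey measurement is supplied. The function name `jackknife_covariance` and the mock values are hypothetical.

```python
import numpy as np

def jackknife_covariance(w_jk, w_full=None):
    """Covariance of log w from N jackknife resamplings.

    w_jk   : array of shape (N, n_bins), w(theta) with each region omitted
    w_full : optional reference measurement of shape (n_bins,); if omitted,
             the mean of the resamplings is used
    """
    w_jk = np.asarray(w_jk, dtype=float)
    n = w_jk.shape[0]
    reference = w_jk.mean(axis=0) if w_full is None else np.asarray(w_full, dtype=float)
    deviation = np.log10(w_jk) - np.log10(reference)
    # (N - 1)/N prefactor instead of 1/(N - 1): the resamplings are correlated.
    return (n - 1) / n * deviation.T @ deviation

# Hypothetical example with N = 4 resamplings and 2 angular bins.
log_w = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
cov = jackknife_covariance(10.0 ** log_w, w_full=10.0 ** np.array([1.5, 3.0]))
cov_default = jackknife_covariance(10.0 ** log_w)
```

The square roots of the diagonal elements give the errors on $\log w(\theta_i)$, while the off-diagonal elements describe the correlations between angular bins.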




Figure 2. Density/scatter plot of redshift error (spectroscopic minus
photometric redshift) against predicted photo-z from this work (top panel) and
SDSS (middle and bottom panels). The colour coding is such that the densest area
(black contour) is 5 times denser than the white contour. Points are drawn
whenever the density of points is less than 10 per cent of the maximum (black
contour). The red squares and error bars represent the mean redshift errors and
their standard deviations in photo-z bins of width
$\Delta z_{\rm phot} = 0.05$. Horizontal red lines show the zero error
benchmark. The improvement in photometric redshift estimates in this work, due
primarily to use of the representative GAMA training set, is clear.



Jackknife is a method of calculating uncertainties on a quantity that we measure
from the data itself. In wide-field galaxy surveys, more often than not, large
superstructures appear to significantly influence clustering measurements. The
best known example is the SDSS Great Wall (Gott et al. 2005). The presence of
such structures makes it tempting to present the results with and without the JK
region that encloses them, as done in the clustering studies of Zehavi et al.
(2005, 2011). Better still, Norberg et al. (2011) devise a more objective method
to consistently remove outlier JK regions from the distribution of all JK
measurements that one has at hand. We follow that method in the present
analysis, and find that for all samples considered, the number of JK regions
that are outliers, and therefore removed, is mostly two or three and no more
than five.



4 PHOTOMETRIC REDSHIFTS 

For the clustering measurements presented in this paper,
all distance information comes from photometric redshifts
(photo-z). Photo-zs are the basis for estimating the redshift
distributions to be used in equation 6 and for estimating
distance moduli to calculate absolute magnitudes and
colours. For this study we have a truly representative subset
of SDSS galaxies down to r < 19.4 and we therefore use
the artificial neural network package ANNz developed by
Collister & Lahav (2004) to obtain photo-z estimates.

It is important that the training set and the final galaxy
sample from SDSS are built using the same selection criteria.
The input parameters are the following: ubercalibrated,
extinction-corrected model magnitudes in ugriz bands, the
radii enclosing 50 per cent and 90 per cent of the Petrosian
r-band flux of the galaxy, and their respective uncertainties.
The architecture of the network is 7:11:11:1, with the seven
input parameters described above, two hidden layers with
11 nodes each and a single output, the photo-z. We use a
committee of 5 networks to predict the photo-zs and their
uncertainties (see Section 4.1).

4.1 Photometric redshift errors 

Before we proceed with the photo-z derived quantities that
we use in this study, we investigate the possible biases and
errors that ANNz introduces, using the known redshifts from
GAMA. Following standard practice we split our data into
three distinct sets: the training set, the validation set and
the test set. Half of the objects constitute the test set and the
other two quarters the training and validation sets. This
investigation is insensitive to the exact numbers in these three
sets. The training and validation sets are used for training
the network, whereas the test set is treated as unknown.
Given predicted photo-zs $z_{\rm phot}$, we can quantify the redshift
error for each galaxy in the test set as

$\delta z = z_{\rm spec} - z_{\rm phot}$, (9)

the primary quantity of interest as far as true redshift errors
are concerned. It can depend on apparent magnitude,
colour, the output $z_{\rm phot}$, the intrinsic scatter $z_{\rm err}$ of ANNz
committees, as well as the position of an object on the sky if
the survey suffers from any photometric non-uniformity. We
investigate some of these potential sources of error below.
The dispersion $\sigma_z$ of $\delta z$ is given by the equation

$\sigma_z^2 = \langle (\delta z)^2 \rangle - \langle \delta z \rangle^2$, (10)

and is found to be $\sigma_z = 0.039$. The standard deviation for
the redshift range $0 < z_{\rm phot} < 0.4$, within which we choose
to work, is $\sigma_z = 0.035$.
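As a minimal illustration of equations 9 and 10 (a sketch only, not the pipeline used in this work), the redshift error and its dispersion can be computed as follows. The function names and the toy redshifts are hypothetical and serve only to make the definitions concrete.

```python
import math

def redshift_errors(z_spec, z_phot):
    """Per-galaxy redshift error: spectroscopic minus photometric redshift."""
    return [zs - zp for zs, zp in zip(z_spec, z_phot)]

def dispersion(dz):
    """Square root of <dz^2> - <dz>^2 (population variance, no N-1 correction)."""
    n = len(dz)
    mean = sum(dz) / n
    mean_sq = sum(d * d for d in dz) / n
    # Guard against tiny negative values from floating-point round-off.
    return math.sqrt(max(mean_sq - mean * mean, 0.0))

# Toy redshifts for illustration only (not GAMA data).
z_spec = [0.10, 0.20, 0.30, 0.15]
z_phot = [0.12, 0.18, 0.33, 0.15]
dz = redshift_errors(z_spec, z_phot)
sigma_z = dispersion(dz)
print(f"sigma_z = {sigma_z:.4f}")
```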

In Fig. 2 we compare our photo-z estimates with
the publicly available photo-z from the SDSS website
(Oyaizu et al. 2008, tables photoz1 and photoz2). For this
comparison we plot the redshift error as a function of photo-z.
We then calculate the mean and the standard deviation
of $\delta z$ for photo-z bins of width $\Delta z_{\rm phot} = 0.05$. The fraction
of catastrophic outliers (galaxies with $|z_{\rm phot} - z_{\rm spec}| > 3\sigma_z$)
for the GAMA calibrated photo-z is 1 per cent or less for all
photo-z bins. We work in fixed photo-z bins, because all our
derived quantities are based on the photo-z estimates. This
way, any biases with estimated photo-z are readily apparent.
Our results based on the GAMA training set outperform the
SDSS results: for the redshift range $0.01 < z_{\rm phot} < 0.4$, we
obtain essentially unbiased redshift estimates, given the observed
scatter. The scatter, in turn, increases with redshift.
We note, however, that the photoz2 catalogue from SDSS
DR7 has been improved with the addition of p(z) estimates
which are designed to perform much better in recovering the
total redshift probability distribution function of all galaxies
(Cunha et al. 2009). Since it is still not clear how to directly
relate a redshift pdf to absolute magnitude and colour for a
given galaxy, our approach for the study of luminosity- and
colour-dependent clustering is easier to interpret.
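The binned statistics and the catastrophic outlier fraction described above can be sketched as follows. This is an illustrative implementation under simple assumptions (fixed-width bins starting at zero, population standard deviation); the function names and toy inputs are hypothetical and are not taken from the analysis code.

```python
import math

def binned_error_stats(z_spec, z_phot, width=0.05, z_max=0.4):
    """Mean and standard deviation of (z_spec - z_phot) in fixed photo-z bins.

    Returns a list of (bin_centre, mean, std, n_galaxies) for non-empty bins.
    """
    n_bins = int(round(z_max / width))
    bins = [[] for _ in range(n_bins)]
    for zs, zp in zip(z_spec, z_phot):
        i = int(zp / width)
        if zp >= 0.0 and i < n_bins:
            bins[i].append(zs - zp)
    stats = []
    for i, dz in enumerate(bins):
        if not dz:
            continue
        mean = sum(dz) / len(dz)
        var = sum((d - mean) ** 2 for d in dz) / len(dz)
        stats.append(((i + 0.5) * width, mean, math.sqrt(var), len(dz)))
    return stats

def outlier_fraction(z_spec, z_phot, sigma_z):
    """Fraction of galaxies with |z_phot - z_spec| > 3 sigma_z."""
    n_out = sum(1 for zs, zp in zip(z_spec, z_phot)
                if abs(zp - zs) > 3.0 * sigma_z)
    return n_out / len(z_spec)

# Toy values for illustration only (not GAMA data).
toy_spec = [0.11, 0.13, 0.27, 0.50]
toy_phot = [0.12, 0.12, 0.26, 0.28]
for centre, mean, std, n in binned_error_stats(toy_spec, toy_phot):
    print(f"z_phot ~ {centre:.3f}: mean = {mean:+.3f}, std = {std:.3f}, N = {n}")
print("outlier fraction:", outlier_fraction(toy_spec, toy_phot, sigma_z=0.039))
```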



Table 2. The change in the total number of galaxies as a result
of the cuts applied in various stages of the analysis.

Cut description                                                     Number of galaxies left

None                                                                4,914,434
Colour cuts (Table 1)                                               4,890,965
Masking                                                             4,511,011
$z_{\rm err}$ (ANNz) $< 0.05$ & $0.002 < z_{\rm phot} < 0.4$        4,289,223
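The effect of each cut in Table 2 can be checked with a few lines of arithmetic. The sketch below uses only the counts listed in the table; the variable and function names are hypothetical, and each fraction is expressed relative to the sample surviving the previous step (an assumption about the reference sample).

```python
# Galaxy counts after each cut, copied from Table 2.
counts = [
    ("None", 4914434),
    ("Colour cuts", 4890965),
    ("Masking", 4511011),
    ("z_err and z_phot cuts", 4289223),
]

def fractions_removed(counts):
    """Fraction of the surviving sample removed by each successive cut."""
    out = []
    for (_, before), (name, after) in zip(counts, counts[1:]):
        out.append((name, (before - after) / before))
    return out

for name, frac in fractions_removed(counts):
    print(f"{name}: {100.0 * frac:.2f} per cent removed")
```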



In Appendix B2 we quantify the photo-z error and
possible contamination between redshift bins by cross-correlating
photo-z bins which are more than $2\sigma_z$ apart.
We find, as expected, that the residual cross-correlation of
the different photo-z bins is negligible compared to their
auto-correlation.

The distribution of photo-z errors is in general non-Gaussian,
albeit less markedly so in the case of a complete
training set. Photo-z errors also propagate asymmetrically
in absolute magnitude: for a given redshift error, the error
induced in absolute magnitude is larger at low-z and smaller
at high-z, and thus a photo-z analysis is more tolerant to
redshift errors for objects at high-z. For that reason, it is
common practice to scale the redshift error by the quantity
$1/(1 + z_{\rm phot})$. Taking into account this redshift stretch, $\sigma_0$
can be defined as

$\sigma_0^2 = \left\langle \left( \frac{\delta z}{1 + z_{\rm phot}} \right)^2 \right\rangle - \left\langle \frac{\delta z}{1 + z_{\rm phot}} \right\rangle^2$, (11)

giving $\sigma_0 = 0.032$.
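The scaled dispersion, and the reason why a fixed redshift error matters more at low redshift, can be illustrated with the sketch below. The second function assumes the low-redshift limit in which luminosity distance is proportional to redshift, so that the distance modulus error is (5 / ln 10) dz / z; this approximation and all names are assumptions made for illustration, not the calculation used in this work.

```python
import math

def scaled_dispersion(z_spec, z_phot):
    """Dispersion of (z_spec - z_phot) / (1 + z_phot)."""
    scaled = [(zs - zp) / (1.0 + zp) for zs, zp in zip(z_spec, z_phot)]
    n = len(scaled)
    mean = sum(scaled) / n
    mean_sq = sum(s * s for s in scaled) / n
    return math.sqrt(max(mean_sq - mean * mean, 0.0))

def magnitude_error_lowz(dz, z):
    """Approximate absolute magnitude error for a redshift error dz.

    Assumption: d_L is proportional to z (low-z Hubble law), so the
    distance modulus changes by (5 / ln 10) * dz / z.
    """
    return 5.0 / math.log(10.0) * dz / z

# The same redshift error gives a much larger magnitude error at low z.
for z in (0.05, 0.35):
    print(f"z = {z:.2f}: dM ~ {magnitude_error_lowz(0.035, z):.2f} mag")
```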

We exclude from our analysis galaxies with $z_{\rm phot} < 0.002$
or $z_{\rm phot} > 0.4$. ANNz provides a photo-z error calculated
from the photometric errors. Using our test set, we
find that this error underestimates the true photo-z error
(given by equation 9). We therefore apply a cut on the
output parameter $z_{\rm err}$ of ANNz at $z_{\rm err} < 0.05$. These cuts
eliminate ~ 4 per cent of the galaxies. Cross-checks show
that the correlation function measurements do not change if
we use a less strict cut, but the chosen cut does improve the
N(z) estimates. The final number of galaxies after this cut
is 4,289,223. We summarize the changes in the number of
galaxies in our sample in Table 2. We use Petrosian magnitudes
to divide galaxies by luminosity and model magnitudes
to calculate galaxy colours.

The photo-z work presented here is similar, but not
identical, to that of Parkinson (2012). The latter is appropriate
for even fainter SDSS magnitudes as it uses, in its training
and validation, all GAMA galaxies with $r_{\rm petro} < 19.8$
and fainter zCOSMOS galaxies (Lilly et al. 2007) matched
to SDSS DR7 imaging. Minor differences in the two photo-z
pipelines, such as the inclusion of different light profile measurements,
do not significantly affect the estimated photo-z,
which present a similar scatter around the underlying
spectroscopic distribution. Our photo-z agree with those of
Parkinson (2012) within the estimated errors.

4.2 Division by redshift, absolute magnitude and 
colour 

Galaxy magnitudes are k + e-corrected to $z_{\rm phot} = 0.1$, using
KCORRECT version 4.1.4 (Blanton & Roweis 2007) and



Figure 3. r-band absolute magnitude against photo-z for our 
photometric sample. Solid red lines show the boundaries of our 
samples in photo-z and absolute magnitude and dashed lines the 
further split in absolute magnitude bins. Only 1 percent of the 
galaxies are shown. 




Figure 4. r-band absolute magnitude against $^{0.1}(g - r)$ colour
(both k-corrected and passively evolved to z = 0.1) for galaxies
split in photo-z bins. Solid red lines show the colour cut for
red and blue populations suggested by Loveday et al. (2012) and
used in this work, while dashed red lines show the colour cut used by
Zehavi et al. (2011).

Figure 5. Redshift error against photo-z for our luminosity and
colour-selected GAMA subsamples. The mean redshift error and
standard deviation in bins of photo-z are shown by the coloured
squares and error bars, while the root mean square standard deviation,
$\sigma_{\rm rms}$, is listed in each panel. The faint red sample has been
omitted due to the small number of galaxies that it contains.



the passive evolution parameter Q = 1.62 of Blanton et al.
(2003). In this simple model, the evolution-corrected absolute
magnitude is given by $M_{\rm corr} = M - Q(z - z_0)$,
where $z_0 = 0.1$ is the reference redshift. We note that
Loveday et al. (2012) using GAMA found Q = 0.7, which
would change evolution-corrected magnitudes by $\approx 0.3$ mag
at z = 0.4. Approximately equal deviations in absolute magnitude
will be induced in our high-z blue galaxy samples
if we use a colour-dependent Q (e.g. Loveday et al. 2012).
Assuming a global value for Q, however, allows for a more
direct comparison with the SDSS-based clustering studies
of Zehavi et al. (2005, 2011). Galaxy colours, derived from
SDSS model magnitudes, are referred to as $^{0.1}(g - r)$, while
absolute magnitudes are derived using the r-band Petrosian
magnitude (to match the GAMA redshift survey selection).
Fig. 3 shows that the r-band absolute magnitude extends
to $M_r - 5 \log h = -16$ mag with a few galaxies reaching as
faint as $M_r - 5 \log h = -14$ mag.
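The size of the difference between the two evolution parameters quoted above can be verified directly from the evolution correction. The sketch below is purely illustrative; the function name and the example magnitude are hypothetical.

```python
def evolution_corrected(M, z, Q=1.62, z0=0.1):
    """Evolution-corrected absolute magnitude, M_corr = M - Q (z - z0)."""
    return M - Q * (z - z0)

# Same galaxy at z = 0.4, corrected with Q = 1.62 and with Q = 0.7.
M_example = -21.0  # arbitrary example magnitude
diff = (evolution_corrected(M_example, 0.4, Q=0.7)
        - evolution_corrected(M_example, 0.4, Q=1.62))
print(f"difference at z = 0.4: {diff:.3f} mag")
```

The difference is (1.62 - 0.7) x (0.4 - 0.1) mag, independent of the example magnitude chosen.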

We split our galaxy sample in photo-z as well as luminosity
bins. Our samples are shown in Fig. 3. Initially we
define four photo-z bins in the redshift range $0 < z_{\rm phot} < 0.4$
and then we further split each photo-z-defined sample into
six absolute magnitude bins in the range $-24 < M_r - 5 \log h < -14$.
Thus our photo-z catalogue offers the opportunity
for a clustering analysis over the luminosity range
$0.03 L^* < L < 8 L^*$, spanning almost three orders of magnitude
in $L/L^*$.

In Fig. 3 some of these redshift-magnitude bins, extending
beyond the survey flux limit, are only partially occupied
by galaxies in terms of photometric redshifts and photo-z
derived absolute magnitudes. The true redshift and absolute
magnitude distributions for each bin are recovered by
Monte-Carlo resampling, as discussed in Section 4.3.

Fig. 4 shows colour-magnitude diagrams for our sample
split in photo-z bins. The colour bimodality is evident at
$^{0.1}(g - r) \approx 0.8$ for all photo-z bins. We have adopted the
tilted colour cut defined by Loveday et al. (2012),

$M_r - 5 \log h = 5 - 33.3 \times {}^{0.1}(g - r)$, (12)

which is a slightly modified version of the colour cut used
by Zehavi et al. (2011), also shown in Fig. 4.
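A galaxy can be assigned to the red or blue population by comparing its colour with the dividing line of equation 12. The sketch below is illustrative only: the function names are hypothetical, and it assumes that galaxies redder than the line at a given magnitude are labelled red.

```python
def cut_colour(M_r):
    """Colour on the dividing line of equation 12 at magnitude M_r - 5 log h.

    Equation 12 reads M_r - 5 log h = 5 - 33.3 * colour, so
    colour = (5 - M_r) / 33.3.
    """
    return (5.0 - M_r) / 33.3

def is_red(M_r, colour):
    """Assumption: galaxies redder than the dividing line are 'red'."""
    return colour > cut_colour(M_r)

# Example: at M_r - 5 log h = -20 the dividing colour is about 0.75.
print(f"cut at M_r = -20: {cut_colour(-20.0):.3f}")
print("colour 0.90 is red:", is_red(-20.0, 0.90))
print("colour 0.60 is red:", is_red(-20.0, 0.60))
```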

In Fig. 5 we plot the photo-z error against photo-z for
galaxies subdivided into subsamples, where we again have
used photometric redshifts to estimate galaxy luminosities
and colours. There are no obvious systematic biases of $z_{\rm spec} - z_{\rm phot}$
for any of the subsamples, although we do note that
the most luminous (faintest) bin contains very few blue (red)
galaxies.

The relatively good photo-zs notwithstanding, our analysis
does not eliminate completely the main systematic error
of neural network derived photo-z, which is the overestimation
of low redshifts and the underestimation of high redshifts
(see e.g. Fig. 7 of Collister et al. 2007). As a result,
a number of faint galaxies have their redshift overestimated
and hence appear brighter in our sample. We note that there
is a discrepancy between the fraction of faint red objects
in the luminosity bin $-19 < M_r - 5 \log h < -17$ between
this work and Zehavi et al. (2011), which is most probably
caused by this systematic shift (see Table 4). It is possible to
cure this by Monte-Carlo resampling the photo-zs with their
respective errors and then rederiving the absolute magnitudes
and colours, but we do not pursue this here.



4.3 Photometric redshift distribution(s) 

Despite the fact that ANNz gives fairly accurate and unbiased
photo-zs for calculations in broad absolute magnitude
bins or photo-z bins, in order to translate the two-dimensional
clustering signal to the three-dimensional one
using equation 6 the underlying true dN/dz is needed. In
this work we loosely follow the approach given in Parkinson
(2012) (see also Driver et al. 2011). The GAMA spectroscopic
sample is highly representative and it allows us to
calculate the true redshift errors as a function of photo-z for
all objects in GAMA with $r_{\rm petro} < 19.4$. Then, under the
assumption of a Gaussian photometric error distribution in
each photo-z bin, we perform a Monte-Carlo resampling of
the ANNz predictions for photo-zs. This is equivalent to replacing
each photo-z derived from ANNz with the quantity
$z_{\rm MC}$ drawn from a Gaussian distribution, using a photo-z
dependent standard deviation, $\sigma_{\rm phot}$, evaluated per photo-z bin:

$z_{\rm MC} = G[\mu = z_{\rm phot}, \sigma = \sigma_{\rm phot}(1 + z_{\rm phot})]$. (13)
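The resampling of equation 13 can be sketched in a few lines. The example below is a simplified illustration, not the procedure used for the measurements: it assumes a single constant scatter for all galaxies (here 0.032, chosen only as a plausible illustrative value) rather than a per-bin value, and all function names, bin limits and toy inputs are hypothetical.

```python
import random

def resample_photoz(z_phot, sigma_phot, rng):
    """Draw z_MC from a Gaussian with mean z_phot and sigma = sigma_phot (1 + z_phot)."""
    return [rng.gauss(z, sigma_phot * (1.0 + z)) for z in z_phot]

def histogram(values, lo, hi, n_bins):
    """Counts of values in n_bins equal-width bins covering [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

def mean_dndz(z_phot, sigma_phot, n_real=100, lo=0.0, hi=0.6, n_bins=30, seed=1):
    """Average redshift histogram over n_real Monte-Carlo realisations."""
    rng = random.Random(seed)
    total = [0.0] * n_bins
    for _ in range(n_real):
        counts = histogram(resample_photoz(z_phot, sigma_phot, rng), lo, hi, n_bins)
        for i, c in enumerate(counts):
            total[i] += c
    return [t / n_real for t in total]

# Toy tophat photo-z selection, 0.1 <= z_phot < 0.2 (not real data).
z_phot = [0.1 + 0.1 * i / 500 for i in range(500)]
dndz = mean_dndz(z_phot, sigma_phot=0.032)
print([round(x, 1) for x in dndz[:15]])
```

The averaged histogram extends beyond the original tophat limits, which is the broadening that the resampling is intended to restore.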



Note that convolving the imprecise photo-z with additional
scatter improves the N(z) redshift distribution: in other
words the photo-z process deconvolves the N(z) and makes
it artificially narrow.

Figure 6. Estimates of the underlying redshift distribution for
the luminosity samples used in the clustering analysis. Thin solid
lines show the photo-z distribution, which is the basis for the
selection, dotted lines the true spectroscopic redshift distribution
from GAMA and thick solid lines the average distribution inferred
from 100 Monte-Carlo resamplings of the photo-z distribution
using equation 13.

Figure 7. The r-band absolute magnitude distribution for
GAMA galaxies with $r_{\rm petro} < 19.4$ split into photo-z and photo-z-derived
absolute magnitude slices. Magnitude distributions shown
by dashed lines are derived from the raw photo-z, by thin lines
from the underlying spectroscopic redshifts and by thick lines
from the Monte-Carlo derived magnitudes. The latter reproduces
the true underlying spec-z inferred magnitude distribution rather
well; however for a few samples there is a discrepancy between
the spec-z-derived and the Monte-Carlo-derived distributions. All
MC absolute magnitude estimates are k-corrected and passively
evolved following the procedure described in Section 4.2.

All our sample selections in Fig. 6 have been made using
the photo-z derived absolute magnitude $M_r - 5 \log h$.
We then use the accurate spectroscopic information from
GAMA to assess how well Monte-Carlo resampling compares
to the underlying true dN/dz. Since the GAMA area is much
smaller than the SDSS area, we do not wish to recover the
exact spectroscopic redshift distribution, merely to match
a smoothed version thereof. Our test shows that MC resampling
performs rather well in recovering the true dN/dz.
This method performs even better with a larger number of
objects, which indicates that we are still dominated by statistical
errors and therefore there is room for improvement in
the future when larger spectroscopic training sets become available.
Nevertheless, as an incorrect redshift distribution can
cause a systematic error in $r_0$, in Appendix B3 we test the
sensitivity of our results to the assumed dN/dz, and compare
results using the Monte-Carlo recovered dN/dz with
those from the weighting method proposed by Cunha et al.
(2009).

Fig. 7 shows, for all samples split by photo-z and photo-z-derived
absolute magnitude, the photo-z-derived, the true
underlying and the Monte-Carlo inferred absolute magnitude
distributions (as dashed, thin and thick solid lines respectively).
We note that the photo-z derived absolute magnitude
estimates in Fig. 7 are obtained from the resampled
redshifts and not by resampling the absolute magnitudes per
se. We then k + e-correct every Monte-Carlo absolute magnitude
realization using the procedure described in Section 4.2.
As expected, the true underlying distribution extends well
beyond the photo-z inferred luminosity bins, but is yet again
rather well described by the Monte-Carlo inferred distribution.

It is crucial that we have a good understanding of the 
true underlying absolute magnitude for all our samples. 
For galaxy clustering studies with spectroscopic redshifts 
it is desirable to work with volume-limited samples. Using 
photometric redshifts, however, one can form only approxi- 
mately volume-limited samples, since photo-z uncertainties 
will propagate into absolute magnitude estimates. Essen- 
tially, any tophat absolute magnitude distribution, as se- 
lected using photo-z, corresponds to a wider true absolute 
magnitude distribution, as shown in Fig. 7. This is rather 
similar to selecting galaxies from a photometric redshift bin 
and then convolving the initial tophat distribution with the 
photo-z error distribution in order to obtain the true N(z). 
However, using the $w(\theta)$ statistic and an accurate dN/dz for
that particular galaxy sample we can extract its respective
spatial clustering signal, which would then correspond to the
$z_{\rm MC}$ derived absolute magnitude. Direct comparisons with
other studies can then be made, modulo the extent of the 
overlap between the two absolute magnitude distributions. 



5 RESULTS FOR THE TWO-POINT 
CORRELATION FUNCTION 

5.1 Luminosity and redshift dependence 

We first calculate the angular correlation function $w(\theta)$ for
our samples selected on absolute magnitude and photometric
redshift over angular scales from 0.005 to 9.4 degrees,
in 15 equally spaced bins in $\log(\theta)$ (see footnote 6). In a
flux-limited survey like SDSS, intrinsically bright galaxies dominate at high
redshifts and intrinsically faint objects dominate at low redshifts
(see Fig. 4). For that reason, we calculate $w(\theta)$ for
the 17 well-populated samples given in Table 3. Errors are
estimated using the jackknife technique, with the covariance
matrix given by equation 8. Even if the validity of a given
error method based on data alone is still widely debated,
it is commonly accepted that the jackknife method is adequate
for angular clustering studies (see e.g. Cabré et al.
2007), while for 3-D clustering measurements Norberg et al.
(2009) have shown that the jackknife method suffers from
some limitations, in particular on small scales.
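The angular binning described above (15 bins equally spaced in the logarithm of the angle between 0.005 and 9.4 degrees) can be generated as follows. This is a minimal sketch with a hypothetical function name; it only illustrates the bin edges and is not the correlation function estimator itself.

```python
import math

def log_bin_edges(theta_min=0.005, theta_max=9.4, n_bins=15):
    """Edges of n_bins bins equally spaced in log10(theta), in degrees."""
    log_min = math.log10(theta_min)
    step = (math.log10(theta_max) - log_min) / n_bins
    return [10.0 ** (log_min + i * step) for i in range(n_bins + 1)]

edges = log_bin_edges()
for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:8.4f} - {hi:8.4f} deg")
```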

Our angular correlation function measurements are
broad and probe both highly non-linear and quasi-linear
scales. Fig. 8 presents galaxy angular correlation functions
for six photo-z selected absolute magnitude bins. We show
the angular scale (lower x-axis), used for the correlation
function estimation, and the corresponding comoving scale
estimated at the mean redshift of the sample (upper x-axis).

Over the range of angular scales fitted, chosen to correspond
to approximately 0.1-20 $h^{-1}$ Mpc comoving separation
according to the mean redshift of each sample, the
angular correlation function can be reasonably well approximated
by a power law, equation 2. We perform power law
fits, both with the full covariance matrix and with the diagonal
elements only. The power law fits for our $L^*$ sample are
shown in Fig. 8. Dotted lines in Fig. 8 show the extension
of the power laws beyond the scales over which they were
fitted. The resulting correlation lengths, $r_0$, slopes, $\gamma$, and
quality of the fits as given by the reduced $\chi^2$, $\chi^2_\nu$, for all
samples are listed in Table 3.
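For the diagonal-only case, the goodness of fit of a power law can be sketched as below. This is an illustration under stated assumptions: the power law is taken to have the common form w = A_w theta^(1 - gamma), which is assumed rather than copied from equation 2, the covariance is treated as diagonal, and the function names and toy data are hypothetical. The full-covariance fit would replace the sum by a quadratic form with the inverse covariance matrix.

```python
def power_law(theta, A_w, gamma):
    """Assumed angular power law, w(theta) = A_w * theta**(1 - gamma)."""
    return A_w * theta ** (1.0 - gamma)

def reduced_chi2_diagonal(theta, w, sigma, A_w, gamma):
    """Reduced chi-square using only the diagonal errors (two fitted parameters)."""
    chi2 = sum(((wi - power_law(t, A_w, gamma)) / s) ** 2
               for t, wi, s in zip(theta, w, sigma))
    return chi2 / (len(theta) - 2)

# Toy data lying exactly on a power law, so the reduced chi-square is zero.
theta = [0.01, 0.1, 1.0]
w = [power_law(t, 0.5, 1.8) for t in theta]
sigma = [0.1, 0.1, 0.1]
print(reduced_chi2_diagonal(theta, w, sigma, A_w=0.5, gamma=1.8))
```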

The luminosity dependence of galaxy clustering is 
present in all photo-z shells: the shape and the amplitude of 
the angular correlation function differ for galaxies with dif- 
ferent luminosity. The amplitude of the angular correlation 
function decreases as we go from bright to faint galaxies 
for all photo-z bins. The slope of the correlation function 
also decreases with decreasing luminosity, very much in line 
with the change in the fraction of red and blue galaxies. As
observed in Section 5.2, red (blue) galaxies dominate the
brightest (faintest) luminosity bins, with red galaxies preferentially
having a steeper correlation function slope than
blue galaxies.

For each sample, we estimate the correlation length $r_0$
via equation 6 using the Monte-Carlo inferred redshift distribution
described in Section 4.3. The redshift distribution
dN/dz is calculated separately for each sample, as shown
in Fig. 6. In Appendix B3 we investigate the effects of the
assumed dN/dz on the recovered correlation length $r_0$, and
show that the adopted dN/dz recovery method compares
favourably with the true underlying dN/dz, as obtained
from the smoothed $dN/dz_{\rm spec}$.

For our luminosity bins in the redshift range $0 < z < 0.1$,
the correlation length is found to decrease as we go
to fainter absolute magnitudes, from $8.21 \pm 2.32\,h^{-1}$ Mpc
($-22 < M_r - 5 \log h < -21$) to $4.28 \pm 1.56\,h^{-1}$ Mpc ($-19 <$



6 Initially our analysis was done down to $\theta = 0.001$ degrees. However,
as shown in Section 5.3 and Appendix B4 the data are not
reliable enough on such small scales.

Figure 8. Two-point angular correlation functions $w(\theta)$ of our samples split into photo-z bins and six photo-z-inferred absolute magnitude
bins, as indicated in each panel, with jackknife errors. The solid lines show power law fits estimated using the full covariance matrix for
the $L^*$ sample. Dotted lines show the extension of the power law fits on scales $< 0.1\,h^{-1}$ Mpc and $> 20\,h^{-1}$ Mpc.



$M_r - 5 \log h < -17$). This is very much in line with the
recent results of Zehavi et al. (2011). Moreover, we do not
observe strong evolution with redshift for samples of fixed
luminosity. All $r_0$ and $\gamma$ measurements are shown in Fig. 9.

There are two main sources of error in the $r_0$ estimates:
(a) the correlated uncertainties on the power law parameters
$\gamma$ and $A_w$ which propagate through equation 6 to $r_0$; (b)
statistical and systematic uncertainties in the modelling of
the underlying redshift distribution. The $w(\theta)$ uncertainties
and the induced error on $r_0$ and $\gamma$ are obtained using the
standard deviation from the distribution of JK resampling
estimates (Section 3.5). As in the case of the covariance matrix,
these uncertainties are multiplied by a factor of $N - 1$
(Norberg et al. 2009). The dN/dz uncertainties are investigated
in great detail in Appendix B3, where we show that
the Monte-Carlo inferred dN/dz performs best, while still
returning a residual systematic uncertainty of $\pm 0.2\,h^{-1}$ Mpc
on $r_0$ that depends on the sample considered. We find that
both sources of uncertainty have a comparable contribution
to the errors. In Table 3 we quote the total error on the
correlation length after adding the two (independent) errors
in quadrature.
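The error budget described above can be sketched as follows. The jackknife function uses the standard delete-one normalisation, variance = (N - 1)/N times the sum of squared deviations of the N jackknife estimates from their mean; that this matches the exact normalisation of equation 8 is an assumption. The function names and toy numbers are hypothetical.

```python
import math

def jackknife_error(estimates):
    """Standard delete-one jackknife error from N resampled estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    return math.sqrt((n - 1) / n * sum((x - mean) ** 2 for x in estimates))

def total_error(statistical, systematic):
    """Combine two independent errors in quadrature."""
    return math.hypot(statistical, systematic)

# Toy jackknife estimates of r_0 (not measured values).
r0_jk = [6.8, 7.1, 6.9, 7.3, 7.0]
stat = jackknife_error(r0_jk)
print(f"statistical = {stat:.3f}, total = {total_error(stat, 0.2):.3f}")
```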



5.2 Luminosity, redshift and colour dependence 

We repeat the clustering analysis splitting the samples into
red and blue colour using equation 12. For each new sample
we re-estimate the underlying redshift distribution used in
the inversion of Limber's equation. The corresponding 50th,
16th and 84th percentiles of the underlying absolute magnitude
distributions are given in Tables 4 and 5. We also
repeat the procedure outlined in Section 5.2 for the error
estimation.

In Fig. 10 we present the angular correlation functions

Figure 9. Left: Power law slope, $\gamma$, as a function of absolute magnitude and redshift. Right: Real space correlation length, $r_0$, as a
function of absolute magnitude and redshift. Absolute magnitude ranges for which $r_0$ and $\gamma$ measurements are valid are given in Table
3.



Table 3. Clustering properties of luminosity-selected samples. Col. 1 lists the photo-z based absolute magnitude ranges, col. 2 the median
absolute magnitude and the associated 16th and 84th percentiles from the Monte-Carlo resampling (Fig. 7) and col. 3 the number of
galaxies in each sample. Cols. 4, 5 and 6 list respectively the slope, $\gamma$, the correlation length, $r_0$, and the reduced $\chi^2$, $\chi^2_\nu$, of the power
law fit as defined in Section 2.4. Cols. 7, 8 and 9 show the same information but for power law fits using only the diagonal elements of
the covariance matrix. All power law fits are approximately over the comoving scales $0.1 < r < 20\,h^{-1}$ Mpc. Finally col. 10 presents the
relative bias at $5\,h^{-1}$ Mpc measured using equation 14.



Sample         Magnitude (MC)      N_gal     γ             r_0             χ²_ν    γ^(d)         r_0^(d)         χ²_ν^(d)   b/b*
M_r - 5 log h  M_r - 5 log h                               [h^-1 Mpc]                            [h^-1 Mpc]                 [h^-1 Mpc]

All colours, 0.3 < z_phot < 0.4
[-24, -22)     -22.0 (…/+0.2)      13257     2.01 ± 0.15   14.08 ± 2.09    3.41    2.02 ± 0.09   13.68 ± 1.22    2.6        2.13 ± 0.30
[-22, -21)     -21.2 (-0.3/+0.3)   339834    1.94 ± 0.11    8.23 ± 1.54   28.08    1.91 ± 0.09    8.46 ± 1.06   13.0        1.22 ± 0.22
[-21, -20)     -20.8 (…/+0.2)      158860    1.75 ± 0.06    6.96 ± 0.56    3.76    1.78 ± 0.05    6.80 ± 0.33    1.8        1.00 ± 0.01

All colours, 0.2 < z_phot < 0.3
[-24, -22)     -22.0 (-0.3/…)      12294     2.02 ± 0.11   13.29 ± 2.01    2.37    2.01 ± 0.07   13.17 ± 1.13    1.7        2.02 ± 0.32
[-22, -21)     -21.2 (-0.4/…)      284969    1.92 ± 0.09    7.92 ± 1.13   10.91    1.90 ± 0.06    8.12 ± 0.70    5.5        1.17 ± 0.17
[-21, -20)     -20.4 (…)           930539    1.75 ± 0.05    6.94 ± 0.76    7.96    1.77 ± 0.05    6.74 ± 0.36    3.3        1.00 ± 0.03
[-20, -19)     …                   122870    1.75 ± 0.08    5.84 ± 0.57    2.44    1.76 ± 0.06    5.84 ± 0.29    1.5        0.86 ± 0.10

All colours, 0.1 < z_phot < 0.2
[-24, -22)     -22.0 (-0.4/+0.3)   4311      1.96 ± 0.09   12.58 ± 1.35    0.59    1.95 ± 0.08   12.57 ± 1.13    0.4        2.10 ± 0.35
[-22, -21)     -21.2 (-0.4/…)      106728    1.92 ± 0.05    7.31 ± 0.60    3.56    1.92 ± 0.04    7.40 ± 0.32    1.7        1.22 ± 0.18
[-21, -20)     -20.3 (…)           604181    1.75 ± 0.05    6.03 ± 0.77    7.16    1.78 ± 0.06    5.85 ± 0.43    3.9        1.00 ± 0.05
[-20, -19)     …                   916563    1.63 ± 0.11    6.36 ± 2.42   42.40    1.71 ± 0.10    5.81 ± 0.75   11.7        1.03 ± 0.30
[-19, -17)     …                   211336    1.55 ± 0.08    5.17 ± 0.83    4.41    1.58 ± 0.07    4.89 ± 0.34    1.6        0.87 ± 0.16

All colours, 0.0 < z_phot < 0.1
[-22, -21)     -21.1 (-0.7/…)      19218     1.89 ± 0.13    8.21 ± 2.32    6.36    1.88 ± 0.07    8.09 ± 0.80    1.6        1.15 ± 0.43
[-21, -20)     -20.3 (…/+0.9)      122787    1.68 ± 0.09    7.31 ± 1.40    9.00    1.75 ± 0.05    6.84 ± 0.50    2.1        0.99 ± 0.23
[-20, -19)     -19.4 (…)           155147    1.60 ± 0.08    6.23 ± 1.06    9.08    1.65 ± 0.08    6.10 ± 0.64    4.5        0.86 ± 0.20
[-19, -17)     -18.1 (…)           271389    1.54 ± 0.06    4.33 ± 0.58    6.20    1.58 ± 0.09    3.97 ± 0.24    2.9        0.65 ± 0.18
[-17, -14)     -16.6 (…)           14659     2.03 ± 0.25    4.28 ± 1.56    5.82    2.00 ± 0.28    4.41 ± 1.03    2.1        0.62 ± 0.25

Figure 10. Two-point angular correlation functions $w(\theta)$ split by absolute magnitude and colour, with red circles (blue squares) showing
the red (blue) sample. Colour gradients indicate the transition from bright (darker shade) to faint (lighter shade) luminosities. Lines are
as in Fig. 8. The faintest (brightest) sample does not contain enough red (blue) galaxies to robustly estimate $w(\theta)$.



in each luminosity and photo-z bin, for red and blue galaxies.
The power law fits over approximately fixed comoving
scales, their corresponding errors as well as the quality of
the fits and the correlation length are estimated as in Section 5.1
and summarized in Tables 4 and 5. As noted earlier,
the power law fits describe the clustering measurements
quite well in a qualitative sense, although certainly not well
enough in a quantitative sense, with most samples presenting
a reduced $\chi^2$ that is typically too large (see Tables 4 and 5).

For all absolute magnitude ranges, the red population 
displays a steeper correlation function slope than the blue 
one. Blue galaxies have a much shallower slope which gradually
decreases with luminosity until a sudden increase in the
slope for the faintest luminosity range probed (Table 5).

The correlation length of red galaxies for all redshift
bins presents a minimum value around $M^*$, with increasing
values both faintwards and brightwards (Table 4). We
note, however, that this result comes with large uncertainties.
For red galaxies the correlation lengths of the brightest
and faintest bins are comparable and faint red objects
are more strongly clustered than red objects with intermediate
luminosities. For the blue population $r_0$ behaves
more regularly (like the overall population), gradually decreasing
with luminosity and redshift. Blue galaxies generally
have smaller uncertainties as well. Our measurement
of the correlation length for the faintest luminosity bin
($r_0 = 4.17 \pm 1.41\,h^{-1}$ Mpc) indicates that these galaxies are
similarly clustered to blue galaxies of intermediate luminosity.
The robustness of this result and some caveats are discussed
in Section 5.3.

Due to the complicated way that the slope and the
correlation length, as well as their respective uncertainties,
change between colour selected samples, we chose to study
more quantitatively the clustering of these samples using

Table 4. Clustering properties of luminosity-selected red galaxies. Columns are the same as in Table 3.



Sample         Magnitude (MC)      N_gal     γ             r_0             χ²_ν    γ^(d)         r_0^(d)         χ²_ν^(d)   b/b*
M_r - 5 log h  M_r - 5 log h                               [h^-1 Mpc]                            [h^-1 Mpc]                 [h^-1 Mpc]

Red, 0.3 < z_phot < 0.4
[-24, -22)     -22.0 (…/+0.2)      13095     2.02 ± 0.15   13.91 ± 2.22    3.01    2.03 ± 0.11   13.65 ± 1.86    2.4        1.78 ± 0.26
[-22, -21)     -21.2 (-0.3/…)      287622    1.98 ± 0.10    8.40 ± 1.64   24.60    1.94 ± 0.10    8.71 ± 1.17   13.7        1.06 ± 0.20
[-21, -20)     -20.7 (…)           79073     1.86 ± 0.05    8.19 ± 0.54    1.33    1.88 ± 0.05    8.08 ± 0.40    1.2        1.00 ± 0.01

Red, 0.2 < z_phot < 0.3
[-24, -22)     -22.0 (…/+0.3)      12200     2.02 ± 0.11   13.33 ± 1.95    1.89    2.01 ± 0.07   13.24 ± 1.11    1.8        1.73 ± 0.41
[-22, -21)     -21.2 (-0.4/…)      242452    1.95 ± 0.10    8.26 ± 1.31   11.23    1.92 ± 0.06    8.41 ± 0.72    6.0        1.05 ± 0.25
[-21, -20)     -20.5 (…)           597678    1.81 ± 0.06    8.01 ± 1.20   17.10    1.84 ± 0.06    7.69 ± 0.52    6.5        0.98 ± 0.04
[-20, -19)     …                   44588     1.95 ± 0.09    8.53 ± 1.30    5.59    1.91 ± 0.08    8.57 ± 0.43    2.8        1.07 ± 0.21

Red, 0.1 < z_phot < 0.2
[-24, -22)     -22.0 (-0.4/+0.3)   4271      1.96 ± 0.08   12.61 ± 1.26    0.47    1.95 ± 0.08   12.57 ± 1.13    0.4        1.87 ± 0.48
[-22, -21)     -21.2 (-0.4/…)      93975     1.94 ± 0.05    7.56 ± 0.71    2.52    1.93 ± 0.04    7.65 ± 0.36    1.6        1.13 ± 0.28
[-21, -20)     -20.3 (…)           393344    1.78 ± 0.11    7.07 ± 1.81   17.30    1.84 ± 0.08    6.68 ± 0.64    6.3        1.03 ± 0.10
[-20, -19)     …                   344815    1.71 ± 0.20    9.69 ± 5.98   82.81    1.85 ± 0.12    8.19 ± 1.26   16.9        1.33 ± 0.66
[-19, -17)     …                   12942     1.86 ± 0.18   17.86 ± 4.26    9.69    1.84 ± 0.14   17.72 ± 2.88    4.6        2.46 ± 0.83

Red, 0.0 < z_phot < 0.1
[-22, -21)     -21.1 (-0.7/…)      18631     1.90 ± 0.14    8.20 ± 2.62    5.97    1.88 ± 0.07    8.14 ± 0.78    1.7        0.96 ± 0.47
[-21, -20)     -20.4 (-0.7/…)      83541     1.71 ± 0.11    8.82 ± 2.34   10.98    1.79 ± 0.07    7.90 ± 0.76    3.2        0.97 ± 0.29
[-20, -19)     …                   45541     1.77 ± 0.16   10.41 ± 3.89   19.29    1.85 ± 0.14   10.39 ± 1.66    8.1        1.15 ± 0.46
[-19, -17)     -18.7 (…)           6690      1.88 ± 0.13   11.59 ± 2.82    2.65    1.90 ± 0.09   11.77 ± 1.32    1.0        1.43 ± 0.51



Table 5. Clustering properties of luminosity-selected blue galaxies. Columns are the same as in Table 3.



Sample         Magnitude (MC)      N_gal     γ             r_0             χ²_ν    γ^(d)         r_0^(d)         χ²_ν^(d)   b/b*
M_r - 5 log h  M_r - 5 log h                               [h^-1 Mpc]                            [h^-1 Mpc]                 [h^-1 Mpc]

Blue, 0.3 < z_phot < 0.4
[-22, -21)     -21.2 (-0.3/…)      52212     1.71 ± 0.07    6.88 ± 0.47    0.78    1.72 ± 0.07    6.87 ± 0.38    0.6        1.14 ± 0.12
[-21, -20)     -20.8 (…/+0.3)      79787     1.75 ± 0.06    5.86 ± 0.49    1.52    1.75 ± 0.10    5.83 ± 0.44    1.3        1.00 ± 0.01

Blue, 0.2 < z_phot < 0.3
[-22, -21)     -21.2 (-0.3/…)      42517     1.74 ± 0.11    6.42 ± 0.81    3.05    1.75 ± 0.12    6.46 ± 0.57    1.5        1.17 ± 0.14
[-21, -20)     -20.4 (-0.4/…)      332861    1.63 ± 0.06    5.35 ± 0.48    4.08    1.66 ± 0.05    5.23 ± 0.23    2.6        0.99 ± 0.01
[-20, -19)     -19.8 (…)           78282     1.72 ± 0.09    5.08 ± 0.47    1.69    1.72 ± 0.09    4.88 ± 0.34    1.2        0.95 ± 0.11

Blue, 0.1 < z_phot < 0.2
[-22, -21)     -21.1 (…)           12753     1.85 ± 0.13    5.70 ± 0.83    0.86    1.85 ± 0.16    5.67 ± 0.64    0.6        1.22 ± 0.17
[-21, -20)     -20.3 (…)           210837    1.67 ± 0.07    4.43 ± 0.32    3.54    1.70 ± 0.06    4.44 ± 0.25    2.6        0.98 ± 0.35
[-20, -19)     -19.4 (-0.5/…)      571748    1.57 ± 0.08    4.75 ± 0.73   11.72    1.62 ± 0.09    4.45 ± 0.42    6.9        1.04 ± 0.14
[-19, -17)     -18.6 (…)           198394    1.53 ± 0.06    4.50 ± 0.49    2.26    1.56 ± 0.06    4.31 ± 0.23    1.2        1.00 ± 0.10

Blue, 0.0 < z_phot < 0.1
[-21, -20)     -20.… (…/+0.9)      39246     1.61 ± 0.14    4.84 ± 0.82    6.52    1.65 ± 0.13    4.66 ± 0.31    3.2        0.97 ± 0.10
[-20, -19)     -19.3 (…)           109606    1.53 ± 0.06    4.63 ± 0.45    2.42    1.57 ± 0.07    4.45 ± 0.40    2.4        0.94 ± 0.21
[-19, -17)     -18.1 (…)           264699    1.54 ± 0.08    4.16 ± 0.63    7.29    1.58 ± 0.11    3.85 ± 0.30    4.4        0.86 ± 0.22
[-17, -14)     -16.6 (-0.9/…)      14305     2.02 ± 0.23    4.17 ± 1.41    5.05    1.99 ± 0.28    4.34 ± 1.00    2.1        0.82 ± 0.33



the relative bias, i.e. their clustering with respect to the L* 
sample. Our relative bias results for all samples, selected 
by photometric redshift, absolute luminosity and colour, are 
presented in Section 6.



5.3 Clustering of faint blue galaxies 

One of the aims of this paper is to study the clustering of
intrinsically faint galaxies, for which only photometric
redshifts are available in sufficient numbers to reliably
calculate $w(\theta)$. The GAMA depth and the extensive SDSS sky
coverage allow us to measure the auto-correlation function of
the faintest optically selected galaxies, i.e. those with photo-z
estimated absolute magnitudes in the range
$-17 < M_r - 5\log h < -14$ and $z_{\rm phot} < 0.08$. This faint
sample contains a total of 14,659 galaxies, which are mostly
star-forming (as is evident from their colours). For the subset
with spectroscopic redshifts, the central 68 per cent of the
actual absolute magnitude distribution covers the range
$-18 < M_r - 5\log h < -12.7$. However, as shown in Appendix B4,
this sample suffers from an overall 50 per cent contamination,
with most spurious objects arising from local, over-deblended
spiral galaxies.

The upper panel of Fig. 11 shows the correlation functions
of all galaxies in our sample with $z_{\rm phot} < 0.08$, split
into finer luminosity bins than used previously. There is
a seemingly artificial steepening of $w(\theta)$ on scales
$\theta < 0.1^\circ$ for galaxies with $M_r - 5\log h > -17$. In
the bottom panel of Fig. 11 we further split the
$-17.9 < M_r - 5\log h < -14$ range into two finer luminosity
bins, and again we find that for fainter samples, source
contamination affects larger angular scales. We study this
contamination and quantify it as a function of scale in
Appendix B4.

Figure 11. Angular correlation functions for the low redshift
galaxies in our sample split in luminosity bins. The finer
luminosity binning allows one to track the scales where
contamination effects (studied and quantified in Appendix B4)
are significant. Error bars have been omitted for clarity.

Having established the angular scales over which we
trust our $w(\theta)$ measurements, we proceed to the clustering
analysis. Using only the diagonal elements of the covariance
matrix⁷, we note that a power law describes the clustering
signal rather well, even though there is a hint of an increase
in the clustering strength at $\sim 1\,h^{-1}$ Mpc. It is possible
that this increase is due to blue galaxies that are satellites in
small dark matter halos. These halos should not be dense
enough to stop star formation and thus we observe only blue
galaxies in this luminosity range (Eminian 2008). A recent
detailed study of the star formation history of H$\alpha$-selected
faint blue galaxies in GAMA can be found in Brough et al.
(2011).

In conclusion, the angular clustering for the faintest
sample has a spurious amplitude at small angular scales,
unless one takes into account the sample contamination. We
do this in Appendix B4, where we visually inspect ~10 per
cent of the objects in this sample and find that a significant
fraction of them are spurious, mainly due to poorly
deblended sources. We quantify the effect of this contamination
in Appendix B4 for all luminosity bins. This investigation
reveals that the angular clustering results on scales
< 0.1 degrees are not reliable. We note that the power law fits
are performed on larger scales, which we show are unaffected by
this contamination. However, a much more detailed investigation
of the data is required to robustly confirm the observed increase
in the slope of the correlation function. Finally, we note
that we have repeated the analysis presented in this Section
for objects selected from the most recent SDSS release,
DR8 (Aihara et al. 2011), and we observe no differences in
the results. The contamination from over-deblended spiral
galaxies is still present in DR8 for the low luminosity bin.

7 Use of diagonal covariance elements only is appropriate for this
faint sample, as it covers a rather small volume for which JK
resampling is unable to provide an accurate description of the
full covariance matrix.



5.4 Quality of fits and the HOD formalism 

The power law fits presented in Table 3 are not all satisfactory
in a quantitative sense. The angular correlation function
is well described by a power law only to first order.
The rather high reduced $\chi^2$ for some samples is due either
to underestimated errors or to the power law model
being inadequate in describing the angular correlation function
over a large range of scales. From the test of Section 3.5
we conclude that the JK method gives consistent errors
irrespective of the way we define the jackknife regions, and
therefore it is most likely that the large reduced $\chi^2$ values
are due to a limitation of the power law model rather
than of the error estimates themselves.

A more sophisticated model, like the halo occupation
distribution (HOD) model (for a review see Cooray & Sheth
2002), would provide a more physically motivated description
of the full correlation function shape, both as a function
of colour and luminosity (Zehavi et al. 2004; Zheng et al.
2005; Zehavi et al. 2005, 2011). The HOD framework, as
shown by Zehavi et al. (2005), explains the increase of
clustering in the faint red population. Bright red galaxies are
central galaxies in massive halos, whereas faint red galaxies
are satellite galaxies in massive halos. Our measurements
suggest that both bright and faint red galaxies are more
strongly clustered than red galaxies of intermediate
luminosity. We also observe a bump in the angular correlation
function of red galaxies at separations $\sim 1\,h^{-1}$ Mpc, which
signals the transition (change in slope) between the one-halo
and two-halo terms in the correlation function. In contrast,
such a change in slope is not evident for the blue population,
hence their smaller $\chi^2$. This is also in agreement
with HOD models, which predict a simple power
law for blue galaxies with luminosities $M_r - 5\log h < -21$
(Zehavi et al. 2005). A complete HOD modelling of these
angular clustering results with photometric redshifts is
beyond the scope of the present work, as it would require
dedicated photo-z HOD tools to be developed, since the
standard threshold samples cannot be defined.



6 BIAS MEASUREMENTS 

6.1 Relative bias and comparison with previous 
studies 

In this paper we parametrize the real space correlation
function with a power law, and infer $\xi(r)$ from angular
clustering measurements via a Limber inversion. To ease
comparison with samples using similar, but not identical,
selection, we follow Norberg et al. (2002) and define the
relative bias of a class of galaxies $i$ with respect to our $L^*$
($-21 < M_r - 5\log h < -20$) sample as

\begin{equation}
\frac{b_i}{b^*}(r) = \sqrt{\frac{(r_0^i)^{\gamma_i}}{(r_0^*)^{\gamma^*}}\, r^{\gamma^* - \gamma_i}} . \tag{14}
\end{equation}

Equation (14) preserves any scale dependence for samples with
different slopes, and we choose here to estimate the relative
bias at $r = 5\,h^{-1}$ Mpc. The advantage of using this definition
of relative bias instead of the raw correlation length to
compare with other studies is twofold. First, the former uses
the slope as well as the correlation length, which as we know
from equation [S] are strongly correlated. Second, if the sample
selections are just slightly different, the relative bias is
a much more robust way of comparing them as it measures
deviations from a series of appropriate reference samples. In
this study this is particularly important, as photo-z inferred
properties are not straightforwardly related to the underlying
ones, as shown in Section 4.3. Our results are shown in
Fig. 12.
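
As a minimal sketch of how the relative bias follows from two power-law fits, the function below takes the correlation length and slope of a sample and of the reference sample and evaluates the ratio at a chosen separation. It assumes $\xi(r) = (r/r_0)^{-\gamma}$ for both samples and that the relative bias is the square root of the ratio of the two correlation functions, as in Norberg et al. (2002); the function name and the input numbers are illustrative only and are not taken from our pipeline.

```python
import math


def relative_bias(r0_i, gamma_i, r0_star, gamma_star, r=5.0):
    """Relative bias b_i/b* at separation r (in h^-1 Mpc).

    Assumes both samples follow xi(r) = (r / r0)**(-gamma) and that
    the relative bias is sqrt(xi_i / xi_star).
    """
    xi_i = (r / r0_i) ** (-gamma_i)
    xi_star = (r / r0_star) ** (-gamma_star)
    return math.sqrt(xi_i / xi_star)


# Illustrative inputs (hypothetical, not a measurement from this paper).
print(relative_bias(r0_i=6.5, gamma_i=1.75, r0_star=5.5, gamma_star=1.70))
```

When the two slopes are equal the result reduces to $(r_0^i/r_0^*)^{\gamma/2}$ and no longer depends on $r$; with different slopes the chosen separation matters, which is why a fixed $r$ has to be quoted.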

Previous studies from both 2dFGRS (Norberg et al.
2001, 2002) and SDSS (Zehavi et al. 2002, 2005, 2011) have
established that the relative bias, $b/b^*$, as a function of
relative luminosity, $L/L^*$, is well described by an affine
relation. We compare our results with these studies in Fig. 12.
For all luminosity bins given in Table 3 we fit the equation

\begin{equation}
b/b^* = a_0 + a_1 L/L^* , \tag{15}
\end{equation}

where $a_0$ and $a_1$ are free parameters. Our best fit values for
samples selected on luminosity, colour and photo-z, using
the corresponding $L^*$ for each sample, are given in Table 6.
The high redshift bin only provides three data points and
thus we do not include it in this exercise (black squares in
Fig. 12). In this Table we also compare with the bias relation
of Norberg et al. (2001), who found $(a_0, a_1) = (0.85, 0.15)$.
The $\Delta\chi^2$ between our best fit and that of Norberg et al.
is 1.2 to 2.3, which makes the fits statistically compatible,
as the 68% confidence interval for 2 degrees of freedom
corresponds to $\Delta\chi^2 = 2.31$ (Press et al. 1992). Zehavi et al.
(2011) measured the bias relative to dark matter, and in
Fig. 12 we rescale their relation with respect to $L^*$. They
also observed a steeper rise in relative bias at high
luminosities. Including a power of $(L/L^*)$ in our fit, we also
obtain a steeper slope whilst $\chi^2$ remains unchanged, despite
the additional degree of freedom.
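
The fit of the affine bias relation and the comparison with a reference pair $(a_0, a_1)$ can be sketched as a weighted linear least-squares problem. The sketch below is an illustration under simplifying assumptions: it treats the errors on $b/b^*$ as independent and Gaussian (the jackknife errors used in the text are correlated), and the function names are hypothetical.

```python
import numpy as np


def fit_bias_relation(l_ratio, bias, bias_err):
    """Weighted least-squares fit of b/b* = a0 + a1 * (L/L*).

    Returns the best-fit (a0, a1) and their covariance matrix,
    assuming independent Gaussian errors on each point.
    """
    x = np.asarray(l_ratio, dtype=float)
    y = np.asarray(bias, dtype=float)
    w = 1.0 / np.asarray(bias_err, dtype=float) ** 2
    design = np.column_stack([np.ones_like(x), x])
    cov = np.linalg.inv(design.T @ (design * w[:, None]))
    params = cov @ (design.T @ (w * y))
    return params, cov


def chi2(params, l_ratio, bias, bias_err):
    """Chi-square of the affine model for a given (a0, a1)."""
    x = np.asarray(l_ratio, dtype=float)
    model = params[0] + params[1] * x
    resid = (np.asarray(bias, dtype=float) - model) / np.asarray(bias_err, dtype=float)
    return float(np.sum(resid ** 2))


def delta_chi2(reference, l_ratio, bias, bias_err):
    """Chi-square difference between a reference (a0, a1) and the best fit."""
    best, _ = fit_bias_relation(l_ratio, bias, bias_err)
    return chi2(reference, l_ratio, bias, bias_err) - chi2(best, l_ratio, bias, bias_err)
```

Calling `delta_chi2((0.85, 0.15), l_ratio, bias, bias_err)` then gives the quantity that is compared with the 68% threshold for two fitted parameters.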

For samples selected by colour as well as luminosity, it
is more difficult to fit equation (15) in each redshift bin. For
most photo-z bins we have four or fewer data points. Moreover,
using finer luminosity bins would worsen the statistical
errors on $N(z)$ and $N(M_r)$ and thus make any fit more
difficult to interpret. Fig. 13 shows that the blue population
follows a similar trend to the full sample but the relative bias
changes more smoothly as a function of luminosity. Table 6
gives the values of $a_0$ and $a_1$ for the colour selected
samples. We fit the same linear relation for red galaxies as well,
despite the fact that a quadratic function would seem more
appropriate. The $\chi^2$ values for the linear fit are also shown in
Table 6 and, from a purely statistical point of view, a linear
relation between $b/b^*$ and $L/L^*$ is still acceptable. Fig. 13
shows that the statistical uncertainty for the two faint red
samples is quite large. This is due to the small number of
objects in the $-19 < M_r - 5\log h < -17$ sample and to the
poor quality of fit for the $-20 < M_r - 5\log h < -19$ sample.

Figure 12. The relative bias, defined in equation (14), at
separations $r = 5\,h^{-1}$ Mpc, of all the absolute magnitude selected
samples used in this study. Data points show the mean and errors
of $b/b^*$ obtained from the distribution of 80 JK measurements
(Sec. 3.5), appropriately scaled to account for the jackknife
correlations. Cyan and magenta lines show our fits over the redshift
ranges $0.2 < z_{\rm phot} < 0.3$ and $0.1 < z_{\rm phot} < 0.2$ respectively.
The solid black line shows the fit of Norberg et al. (2001) and the
dotted line the fit of Zehavi et al. (2011).

Table 6. Fitted values of $a_0$ and $a_1$ in the bias-luminosity
relation (equation 15) in three photo-z ranges. Column 1 lists the
redshift bin limits, columns 2, 3 and 4 the fitted values and the
quality of fit (reduced $\chi^2$) and column 5 lists $\Delta\chi^2$ between
our best fit values and the fit by Norberg et al. (2001).

Redshift range            a_0           a_1           χ²_ν    Δχ²

All colours
  0.2 < z_phot < 0.3      0.71 ± 0.04   0.25 ± 0.02   1.10    2.32
  0.1 < z_phot < 0.2      0.82 ± 0.06   0.24 ± 0.03   0.14    1.79
  0.0 < z_phot < 0.1      0.65 ± 0.05   0.27 ± 0.06   0.12    1.18

Red
  0.2 < z_phot < 0.3      0.92 ± 0.17   0.12 ± 0.07   0.36    0.29
  0.1 < z_phot < 0.2      1.28 ± 0.43   0.03 ± 0.17   2.33    1.76

Blue
  0.2 < z_phot < 0.3      0.84 ± 0.08   0.15 ± 0.06   0.29    0.77
  0.1 < z_phot < 0.2      0.98 ± 0.07   0.08 ± 0.06   0.23    4.22
  0.0 < z_phot < 0.1      0.86 ± 0.02   0.08 ± 0.02   0.07    0.02



6.2 The evolution of absolute bias for L* galaxies 



In Section 6.1 we calculated the relative galaxy bias using
the $L^*$ sample ($-21 < M_r - 5\log h < -20$) as our reference
sample. In this Section we calculate the absolute bias of the
$L^*$ population, defined as the mean ratio of the observed
galaxy correlation function, parametrized with a power law,
over the non-linear dark matter theoretical correlation function

\begin{equation}
b^*(r) = \sqrt{\frac{\xi^*(r)}{\xi_{\rm dm}(r)}} , \tag{16}
\end{equation}

where $5\,h^{-1}$ Mpc $< r < 20\,h^{-1}$ Mpc. The theoretical power
spectrum $P(k)$ was obtained using CAMB (Lewis et al. 2000)
and the halo correction recipe of Smith et al. (2003). We
then Fourier transform the non-linear $P(k)$ to obtain the
real space $\xi_{\rm dm}(r)$ using the FFTLog package provided by
Hamilton (2000).

Figure 13. The relative bias, defined in equation (14), at
separations $r = 5\,h^{-1}$ Mpc, of all the samples used in this study
split by colour (equation 12). Data points show the mean and errors
of $b/b^*$ obtained from the distribution of 80 jackknife
measurements (Sec. 3.5), appropriately scaled to account for the
jackknife correlations. Colour coding is as in Fig. 10.

Figure 14. The evolution of clustering of $L^*$ galaxies in the local
universe: Upper panel shows the correlation length $r_0$; lower panel
shows the bias $b^*(z)$, as a function of redshift. The dashed line
in the lower panel shows the linear theory prediction from
equation (17). Across the redshift range $0.07 < z < 0.32$ the bias
of $L^*$ galaxies agrees rather well with the linear theory model.

Since we have correlation function measurements of
the $L^*$ population for a range of redshifts, we can ask
whether the evolution of the bias can be
described by the passive evolution model introduced by
Tegmark & Peebles (1998):

\begin{equation}
[b(z_1) - 1]\,D(z_1) = [b(z_2) - 1]\,D(z_2) , \tag{17}
\end{equation}

where $D$ is the growth of structure (Peebles 1980), which we
calculate using the grow$\lambda$ package by Hamilton
(2001), which includes corrections to $D(z)$ due to the presence
of the cosmological constant. The model described by
equation (17) assumes that the galaxy density field linearly
traces the dark matter density field and that all clustering
evolution comes from the growth of structure in the linear regime,
i.e. no merging. It is believed that $L^*$ galaxies have undergone
very little merging since $z \approx 1$ (Conselice et al. 2009;
Lotz et al. 2011).
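
The passive evolution relation can be evaluated with a few lines of code once a growth factor is available. The sketch below is not the grow$\lambda$ package: it integrates the standard growth integral for a flat universe with matter and a cosmological constant, $D(a) \propto E(a) \int_0^a da' / [a' E(a')]^3$, and the default $\Omega_m = 0.25$ is an assumption made for illustration rather than a value quoted in this section. Function names are hypothetical.

```python
import numpy as np


def growth_factor(z, omega_m=0.25, n_steps=4000):
    """Linear growth factor D(z), normalised so that D(0) = 1.

    Assumes a flat universe with matter and a cosmological constant;
    omega_m is an illustrative default.
    """
    def unnormalised(a_max):
        a = np.linspace(1e-6, a_max, n_steps)
        e = np.sqrt(omega_m * a ** -3 + (1.0 - omega_m))  # E(a) = H(a)/H0
        y = 1.0 / (a * e) ** 3
        # Trapezoidal rule written out explicitly.
        integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(a))
        return e[-1] * integral

    return unnormalised(1.0 / (1.0 + z)) / unnormalised(1.0)


def passive_bias(z, z_ref, b_ref, omega_m=0.25):
    """Bias at redshift z implied by [b(z) - 1] D(z) = [b(z_ref) - 1] D(z_ref)."""
    ratio = growth_factor(z_ref, omega_m) / growth_factor(z, omega_m)
    return 1.0 + (b_ref - 1.0) * ratio
```

Fixing the bias at the highest redshift and calling `passive_bias` on a grid of lower redshifts traces out the kind of curve that the model predicts; for a population with $b > 1$ the predicted bias decreases toward $z = 0$.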

In the upper panel of Fig. 14 we plot the correlation
length as a function of redshift. $r_0$ is observed to change
very little since $z \approx 0.32$. The lowest redshift point has
larger errors due to the limited volume sampled. For comparisons
with theory, it is clearer to use the bias instead of the
correlation length. In the lower panel of Fig. 14 we plot the
evolution of the absolute bias, as defined in equation (16), along
with the theoretical prediction of Tegmark & Peebles (1998)
for passive clustering evolution (dashed line). In practice, we
fix the high-z value of $b(z)$ and then solve equation (17) over
the redshift range $0.07 < z < 0.32$. We find that the evolution
of clustering of $L^*$ galaxies is consistent with the model
of Tegmark & Peebles (1998).

This agreement between the clustering of $L^*$ galaxies
and the passive evolution model was not observed by
Ross et al. (2010), who used SDSS photo-z's. The sample
selection and the modelling of $w(\theta)$ and bias in this study
and in that of Ross et al. (2010) are very different: we use
GAMA calibrated photo-z and model the correlation function
with a power law, whereas they used SDSS calibrated
photo-z down to $r < 21$ and used halo modelling for the
correlation function. Ideally one would expect the two
studies to give consistent results, but it might be that
the aforementioned differences in the theoretical modelling
and the sample selection influence the results significantly.



7 DISCUSSION AND CONCLUSIONS 

Despite their inherent limitations, photometric redshifts of- 
fer the opportunity to study the clustering of various galaxy 
populations using large numbers of objects over a wide range 
of angular scales with improved statistics, with the caveat 
that their systematic uncertainties are significantly more 
complex to deal with. In this section we summarize and dis- 
cuss the main implications of our results. 

Using GAMA spectroscopic redshifts as a training set,
we have compiled a photometric redshift catalogue for the
SDSS DR7 imaging catalogue with $r_{\rm petro} < 19.4$. We carried
out extensive tests to check the robustness of the photo-z
estimates and used them for calculating r-band absolute
luminosities. We split our sample of 4,289,223 galaxies into
samples selected on photometric redshift, colour and luminosity
and estimated their two point angular correlation functions.
Redshift distributions for the Limber inversion are calculated
using Monte-Carlo resampling, which we show to be very
reliable.

Our clustering results are in agreement with other
clustering studies such as Norberg et al. (2002) and Zehavi et al.
(2011), who used spectroscopic redshifts. We extend the
analysis to faint galaxies where photo-zs allow us to obtain
representative numbers for clustering statistics. We find that
the correlation length decreases almost monotonically toward
fainter absolute magnitudes and that the linear relation
between $b/b^*$ and $L/L^*$ holds down to luminosities
$L \sim 0.03 L^*$. For the $L^*$ population we observe a bias
evolution consistent with the passive evolution model proposed
by Tegmark & Peebles (1998).

As shown by others (Norberg et al. 2002; Hogg et al.
2003; Zehavi et al. 2005; Swanson et al. 2008a; Zehavi et al.
2011) and confirmed here, the colour dependence is more
intriguing because faint red galaxies exhibit a larger
correlation length than red galaxies at intermediate
luminosities. This trend is explained by HOD models, as shown by
Zehavi et al. (2005). Clustering for blue galaxies depends
much more weakly on luminosity. We find that at faint
magnitudes the SDSS imaging catalogue is badly contaminated
by shreds of over-deblended spiral galaxies, which makes the
interpretation of the clustering measurements difficult. We
determine an angular scale beyond which our results are not
affected by this contamination, and test this by modelling
the scale dependence of the contamination as well as studying
its luminosity dependence.

The use of photometric redshifts is likely to dominate 
galaxy clustering studies in the future. A number of assump- 
tions made in this work might need to be reviewed when we 
have even better imaging data and training sets. In
particular, for cosmology, the non-Gaussianity of photo-z and
robust reconstruction of redshift distributions will become a
very pressing issue. For galaxy evolution studies, it is
essential to study the mapping between a photo-z derived
luminosity range and the true underlying one, as HOD modelling
of the galaxy two point correlation function relies heavily
on the luminosity range considered. In this paper, we
report only qualitative agreement and leave any HOD study
using these photometric redshift inferred clustering results
to future work.

APPENDIX A: SDSS SQL QUERY

The SQL query used to extract our sample from the SDSS
DR7 database is:

SELECT
  objid, g.ra, g.dec, flags, petror50_r,
  petror50Err_r, petror90_r, petror90Err_r,
  petroMag_r - extinction_r as petroMagCor_r,
  petroMagErr_r,
  modelMag_u - extinction_u as modelMagCor_u,
  modelMag_g - extinction_g as modelMagCor_g,
  modelMag_r - extinction_r as modelMagCor_r,
  modelMag_i - extinction_i as modelMagCor_i,
  modelMag_z - extinction_z as modelMagCor_z,
  modelMagErr_u, modelMagErr_g, modelMagErr_r,
  modelMagErr_i,
  modelMagErr_z
FROM galaxy g
  JOIN Frame f on g.fieldID = f.fieldID
WHERE
  zoom = 0 and stripe between 9 and 44
  and psfmag_r - modelmag_r > 0.25 and
  petromag_r - extinction_r < 19.4
  -- detected in BINNED1
  AND ((flags_r & 0x10000000) != 0)
  -- not NOPROFILE, PEAKCENTER, NOTCHECKED,
  -- PSF_FLUX_INTERP, SATURATED, or BAD_COUNTS_ERROR
  AND ((flags_r & 0x8100000c00a0) = 0)
  -- not DEBLEND_NOPEAK, or small PSF magnitude error
  AND (((flags_r & 0x400000000000) = 0) or
       (psfmagerr_r <= 0.2))
  -- not INTERP_CENTER, or not COSMIC_RAY
  AND (((flags_r & 0x100000000000) = 0) or
       (flags_r & 0x1000) = 0)

Figure B1. Angular correlation functions of the r-band apparent
magnitude bins defined in Table B1.



APPENDIX B: TESTS FOR SYSTEMATICS

Clustering studies using photometric redshifts are subject
to systematic errors, which become more pressing as the
statistical errors are significantly decreased. In this Appendix
we study the most relevant sources of systematic errors that
might affect our results. A similar study, for a brighter
sample of galaxies at higher redshifts ($0.4 < z < 0.7$), was
recently presented by Ross et al. (2011a).

Here we present tests of the systematics that we believe are
most likely to affect the results shown in this paper. We start in
Appendix B1 with a scaling test, which mostly tests the
reliability of the whole sample for clustering studies. In
Appendix B2 we quantify the possible systematics in the
clustering signal due to spurious cross-correlations of different
photometric redshift bins. In Appendix B3 we test for
possible systematics in the spatial correlation function
introduced by the redshift distributions used in Limber's
equation. Lastly, in Appendix B4 we examine the robustness of
the correlation function of the faintest luminosity bin.



B1 Scaling test

Table B1. Clustering properties in apparent magnitude bins
defined by r-band Petrosian magnitude. Column 1 lists the magnitude
range, column 2 the number of galaxies, columns 3 and 4 give
the values of $\gamma$ and $r_0$, defined in equation \E\. Column 5
lists the quality of the power law fits. Errors were calculated
using the full covariance matrix, but we do not include the
$N(z)$ uncertainty.

r-bin (mags)       N_g        γ             r_0           χ²_ν
12.0 < r < 16.0    79543      1.81 ± 0.03   5.01 ± 0.48   1.01
16.0 < r < 17.0    201805     1.72 ± 0.02   5.76 ± 0.31   3.1
17.0 < r < 18.0    671315     1.73 ± 0.01   5.62 ± 0.20   3.38
18.0 < r < 18.5    768620     1.74 ± 0.01   5.58 ± 0.17   2.28
18.5 < r < 19.0    1336411    1.73 ± 0.01   5.50 ± 0.12   2.55
19.0 < r < 19.4    1720930    1.71 ± 0.01   5.20 ± 0.12   3.48

With a photometric sample of this size it is prudent to perform
a scaling test in order to uncover any dependence of
clustering on apparent magnitude. To do this we
split our sample into apparent magnitude bins and
calculate the angular correlation function for each. The apparent
magnitude ranges are given in Table B1 and the angular correlation
functions are shown in Fig. B1. For all apparent magnitude
bins the slope is approximately equal, but the amplitude
varies as expected, shifting from high to low values as we
go fainter. We then use equation (6) to calculate the
correlation length for each magnitude range. We fit over scales
of $0.01 < \theta < 2$ degrees ($0.02 < \theta < 1.2$ degrees for the
$12 < r < 16$ sample). The correlation length for each
magnitude bin is found to be equal within the error bars and in
agreement with the earlier study of Budavári et al. (2003).
Thus, for all well populated apparent magnitude bins we
recover the fiducial power law (Peebles 1980)



\begin{equation}
\xi(r) = \left(\frac{r}{5\,h^{-1}\,{\rm Mpc}}\right)^{-\gamma} . \tag{B1}
\end{equation}



B2 Cross correlation of photometric redshift cells 

A crucial consistency check, necessary for the validation of
our results, is the study of the induced cross-correlations
between redshift shells defined by photo-zs from our sample.
Since we have established that $\sigma_z \approx 0.04$, we start from
$z_{\rm phot} = 0$ and use five contiguous slices with $\Delta z = 0.08$,
in order to allow all galaxies with a photo-z error of $< 2\sigma$ to
be included in the correct redshift bin. We then cross-correlate
slices which are more than one $\Delta z$ apart.

If a Gaussian with $\sigma = 0.04$ provides a good approximation
of the error $\sigma_z$, then we can estimate what fraction
of galaxies should lie outside the width of each photo-z
slice. A galaxy which is outside its redshift slice with width
$\Delta z = 0.08$ will have an error greater than $2\sigma$. For a Gaussian
distribution ~5 per cent of all galaxies should lie outside
their redshift boundaries. Therefore their residual contribution
to the cross-correlation should be ~10 per cent of
their auto-correlation⁸. In Fig. B2 we present three
auto-correlation functions and their respective cross-correlations.
The cross-correlation functions in Fig. B2 are not entirely
consistent with zero, but on all scales the residual signal is
of the expected order of magnitude. Fig. B2 demonstrates
that ANNz does not produce spurious correlations between
physically disjoint galaxies.
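
The quoted ~5 per cent can be checked directly from the Gaussian error function. The short sketch below computes the two-sided tail probability beyond a given number of standard deviations; it only illustrates the Gaussian assumption made in the text, and the function name is hypothetical.

```python
import math


def fraction_outside(n_sigma):
    """Two-sided fraction of a Gaussian lying beyond n_sigma standard deviations."""
    return 1.0 - math.erf(n_sigma / math.sqrt(2.0))


# Fraction of galaxies with a photo-z error larger than 2 sigma.
print(round(fraction_outside(2.0), 4))
```

For two sigma this evaluates to about 0.0455, i.e. the ~5 per cent used above.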



8 Assuming that the two auto-correlations are equal and the
number of galaxies in each sample is equal as well. For a detailed
treatment of these effects see Benjamin et al.

Figure B2. Auto-correlation (diamonds and circles) and cross-correlation (squares) functions for photo-z bins. The cross-correlation
signals have negligible magnitude compared with the auto-correlations, and for angular separations $\gtrsim 0.01$ degrees are consistent with
zero. The errors are calculated using JK resampling as explained in Section 3.5.



B3 Testing dN/dz 

Here we test the accuracy of our recovered $dN/dz$
distribution by studying angular clustering in the GAMA area.
Since we have precise knowledge of the spectroscopic redshift
distributions in the GAMA area, we use these angular
clustering measurements to test the robustness of our spatial
clustering results using different methods of recovering
$dN/dz$. The methods that we test against the given GAMA
spectroscopic redshift distributions are (i) Monte-Carlo
resampling of the photo-z distributions, assuming Gaussian
errors (equation 13), which has been used for all the results
in this paper, and (ii) the weighting method of Cunha et al.
(2009) (also known as the nearest neighbour method).
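
Method (i) can be sketched as follows: each photometric redshift is perturbed by a Gaussian error, the perturbed redshifts are binned, and the procedure is repeated to obtain a mean distribution and its scatter. This is a simplified illustration, not our pipeline: it assumes a single Gaussian width `sigma_z` for all objects (which may also be passed as a per-object array), and the function name and arguments are hypothetical.

```python
import numpy as np


def resample_nz(z_phot, sigma_z, bin_edges, n_real=100, seed=0):
    """Monte-Carlo estimate of N(z) from photometric redshifts.

    Each realisation draws z ~ Normal(z_phot, sigma_z) per object and
    histograms the draws. Returns the mean counts per bin and the
    scatter between realisations.
    """
    rng = np.random.default_rng(seed)
    z_phot = np.asarray(z_phot, dtype=float)
    counts = np.empty((n_real, len(bin_edges) - 1))
    for k in range(n_real):
        z_draw = rng.normal(z_phot, sigma_z)
        counts[k], _ = np.histogram(z_draw, bins=bin_edges)
    return counts.mean(axis=0), counts.std(axis=0)
```

The scatter returned alongside the mean is the kind of quantity that can be propagated into the uncertainty on the correlation length through the Limber inversion.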

The latter method can be summed up in three distinct
steps. First, one estimates the distance in apparent magnitude
space to the 200th nearest neighbour of each object
in the spectroscopic set, using a Euclidean metric. The
exact ordinal number of the neighbouring object should not
change the result significantly. For the GAMA number density,
$N = 200$ is the best trade-off between smoothing out
the large scale structure and preserving the locality of the
photometric information. Second, one calculates the number
of objects in the photometric set that are within the
hypervolume defined by this distance, and then the weight of
each object in the spectroscopic set at point $m_i$ according
to the equation

\begin{equation}
W_i = \frac{1}{N_{\rm phot,tot}} \frac{N(m_i)_{\rm phot}}{N(m_i)_{\rm spec}} , \tag{B2}
\end{equation}

where $N(m_i)_{\rm spec} = 200$. In the third step, the already known
spectroscopic distribution is weighted to match the
distribution of the photometric sample. The weighting is done by
summing the weights $W_i$ of each object in the spectroscopic
sample for all redshift ranges:

\begin{equation}
N(z)_{\rm wei} = \sum_i W_i\, N(z_1 < z_i < z_2)_{\rm spec} . \tag{B3}
\end{equation}

Cunha et al. (2009) show that this method is superior in
recovering the true $dN/dz$ to other methods using photo-zs,
but they do not include the Monte-Carlo resampling in their
comparisons.
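
The three steps of the weighting method can be sketched as below. This is a brute-force illustration under stated assumptions, not the implementation of Cunha et al. (2009): distances are computed for all pairs, which is only practical for small samples, the neighbour count `k` plays the role of $N(m_i)_{\rm spec}$, and the function names are hypothetical.

```python
import numpy as np


def neighbour_weights(mags_spec, mags_phot, k=200):
    """Weights for spectroscopic objects from nearest-neighbour counts.

    mags_spec, mags_phot: arrays of shape (n_objects, n_bands).
    For each spectroscopic object the radius to its k-th nearest
    spectroscopic neighbour is found, photometric objects inside that
    radius are counted, and the weight is N_phot / (N_phot_tot * k).
    """
    mags_spec = np.asarray(mags_spec, dtype=float)
    mags_phot = np.asarray(mags_phot, dtype=float)
    # Step 1: Euclidean distance to the k-th nearest spectroscopic neighbour
    # (index 0 of the sorted distances is the object itself).
    d_ss = np.linalg.norm(mags_spec[:, None, :] - mags_spec[None, :, :], axis=2)
    radius = np.sort(d_ss, axis=1)[:, k]
    # Step 2: number of photometric objects within that radius.
    d_sp = np.linalg.norm(mags_spec[:, None, :] - mags_phot[None, :, :], axis=2)
    n_phot = np.sum(d_sp <= radius[:, None], axis=1)
    return n_phot / (len(mags_phot) * k)


def weighted_nz(z_spec, weights, bin_edges):
    """Step 3: weighted histogram of the spectroscopic redshifts."""
    hist, _ = np.histogram(z_spec, bins=bin_edges, weights=weights)
    return hist
```

For catalogues of realistic size a tree-based neighbour search would replace the pairwise distance matrices; the brute-force form is kept here only because it makes each step explicit.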

The comparison of the different methods is depicted in
Fig. B3, where all the clustering measurements are confined
to the GAMA area. The errors for the angular clustering
measurements are assumed to be Poisson, which is just a
lower bound, and the errors on the redshift distributions are
obtained from the scatter of Monte-Carlo simulations. This
test is performed for the same luminosity bins as in
Section 4.2, apart from the brightest and faintest bins, which
have a very small number of galaxies and hence large
statistical errors on $w(\theta)$.

The (a priori required) agreement between the $r_0$
measurements from the different methods of recovering $dN/dz$
is not perfect. Nevertheless, the $r_0$ measurements are not
significantly affected by the differences between the redshift
distributions of Fig. [(J]. In conclusion, Fig. B3, for the three
intermediate and well populated luminosity bins, implies that the
reconstruction of the underlying redshift distribution is not
introducing any systematic errors in the $r_0$ measurements.

This comparison does have its limitations. Samples with
small numbers of objects are sensitive to number variations
due to the different selections of the two surveys (mainly
the more conservative star-galaxy separation that we use in
this paper). Moreover, it is very difficult to get realistic
error bars for samples with a small number of galaxies and for
which the survey's angular extent is comparable with the
angular scales used for the $w(\theta)$ measurements. The
difficulty in getting the exact angular clustering signal is shown
in the upper panel of Fig. B3, which shows the residuals of
the measured slopes for the GAMA and SDSS samples. In
spite of these limitations, Monte-Carlo resampling seems to
recover the true $r_0$ slightly better than the weighting method.

B4 Correlation function for faint galaxies 

The correlation function of the faintest sample, $[-17, -14)$,
exhibits an implausibly large clustering amplitude at small
scales (Fig. B4). This increase in the clustering signal is not
hinted at in the $-19 < M_r - 5\log h < -17$ luminosity bin,
and so we investigate here whether there is some sort of
contamination in the faintest sample.

Figure B3. Upper panel: Slope residual of the correlation function
measurements in the GAMA area, using the measurement
of the GAMA sample with spectroscopic redshifts as a reference
($\Delta\gamma = \gamma({\rm SDSS}) - \gamma({\rm GAMA})$). Lower panel: Comparison of
the effect of the various redshift distributions (as shown in
Fig. [SJ) on $r_0$ measurements, again using the GAMA sample as a
reference ($\Delta r_0 = r_0(i) - r_0({\rm GAMA})$). Following the
discussion in Sec. 5.1, the error bars show the combined effect of
the power law fit uncertainties (assumed to be Poisson), which are
independent of the underlying $dN/dz$, and the scatter in $r_0$ due
to 100 Monte-Carlo resamplings of each $dN/dz$ (only
$(dN/dz)_{\rm spec}$ is known precisely).

We randomly select ~10 per cent of the objects in the
faintest luminosity bin and visually inspect them to see
if they are genuine galaxies. The fraction of spurious
objects is shown in the left panel of Fig. B5. It is
significant at the very faint end, where the actual number of
galaxies is low (red line in the same figure), and is
~40 per cent at the bright end of that luminosity bin.
From our visual inspection, most spurious objects are local,
over-deblended spiral galaxies; the remainder are merging
systems or just sky noise. Evidently, the contamination level
increases as we go fainter, which presents a serious
drawback for clustering studies and a serious limitation for
large surveys.
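
The spurious fraction and its error bar for one magnitude bin can be sketched as below. This is only an illustration: the counts are hypothetical and not taken from the paper, and the function name `spurious_fraction` is ours. The error follows the Poisson assumption quoted for the error bars of Fig. B5, applied here to the spurious count only.

```python
import math

def spurious_fraction(n_spurious, n_inspected):
    """Fraction of spurious objects with a Poisson error on the spurious count."""
    if n_inspected <= 0:
        raise ValueError("n_inspected must be positive")
    fraction = n_spurious / n_inspected
    # Assumption: only the spurious count fluctuates, with variance equal
    # to the count itself (Poisson); n_inspected is treated as fixed.
    error = math.sqrt(n_spurious) / n_inspected
    return fraction, error

# Hypothetical counts: 40 spurious objects out of 100 inspected.
frac, err = spurious_fraction(40, 100)
print(f"spurious fraction = {frac:.2f} +/- {err:.3f}")
```

For fractions close to 0 or 1 a binomial error would behave better than this Poisson estimate, which is used here only because it matches the stated assumption.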

The right panel of Fig. B5 shows the fraction of spurious
objects in the other five absolute magnitude bins. We
visually inspected ~100 objects from each of those bins and
found that the contamination level is much lower, with
a slight increase toward the bright and faint ends. Our
detailed study of the correlation function of the faintest bin
shows that it is not affected by contamination on the scales
of primary interest (θ > 0.1°), something which we expect
to hold true for all other luminosity bins, which have a
significantly smaller fraction of spurious objects.
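
To make the restriction to the scales of primary interest concrete, the sketch below fits a power law w(θ) = A θ^(1-γ) using only points with θ above 0.1 degrees. It is a minimal sketch, not the fitting procedure of this paper: it is an unweighted least-squares fit in log-log space that ignores error bars, covariances between angular bins and the integral constraint, and the function name `fit_power_law` and its parameters are hypothetical.

```python
import numpy as np

def fit_power_law(theta_deg, w, theta_min=0.1):
    """Fit w(theta) = A * theta**(1 - gamma) on scales theta > theta_min.

    Hypothetical helper: unweighted straight-line fit in log-log space.
    Returns the amplitude A (value of w at 1 degree) and the slope gamma.
    """
    theta_deg = np.asarray(theta_deg, dtype=float)
    w = np.asarray(w, dtype=float)
    # Keep only trusted scales; non-positive w cannot be used in log space.
    keep = (theta_deg > theta_min) & (w > 0)
    if keep.sum() < 2:
        raise ValueError("need at least two usable points for the fit")
    slope, intercept = np.polyfit(np.log10(theta_deg[keep]),
                                  np.log10(w[keep]), 1)
    gamma = 1.0 - slope          # because log w = log A + (1 - gamma) log theta
    amplitude = 10.0 ** intercept
    return amplitude, gamma
```

With this cut, an excess of signal confined to θ below 0.1 degrees, such as that produced by over-deblended objects, does not enter the fit at all.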

The contamination in the -17 < M_r - 5 log h < -14
luminosity bin affects the two-point correlation function
differently at different angular scales. We address this issue
by counting the number of pairs of genuine galaxies in the
visually inspected subset. The results are shown in Fig. B4,
where we also include the angular correlation function from
the corresponding sample from GAMA (footnote 9). Because
the subset has a weakened signal at very small scales,
we can only draw conclusions for angular scales > 0.1
degrees. From Fig. B4 we see that at these scales the
contamination does not significantly affect the correlation
function and its fit parameters γ and r_0. For this reason, we
present our results limited to angular scales θ > 0.1°.

Figure B4. Two-point correlation function of the faintest
luminosity bin (-17 < M_r - 5 log h < -14). Black circles show
the total correlation function, blue squares show the correlation
function of the ~10 per cent subset of objects visually inspected,
green stars show the correlation function of the "clean" part of
the inspected subset, red diamonds show the total correlation
function corrected to account for the spurious pairs on scales
> 0.1 degrees and, finally, cyan triangles show the w(θ)
measurement using only GAMA spectroscopic data. Error bars for
the total sample are calculated using the JK method. Open
symbols represent angular scales at which the signal is
significantly contaminated and so cannot be trusted.

Figure B5. Left panel: Black symbols show the fraction of
spurious objects for the faintest luminosity bin as a function of
absolute luminosity. These fractions are estimated by visually
inspecting ~10 per cent of the total number of objects in that
bin. Red symbols show the overall distribution of objects as a
function of absolute magnitude. Right panel: Fraction of spurious
objects as a function of absolute luminosity, obtained by visually
inspecting a small subset (~100) of all objects in each
luminosity bin. In both panels the error bars are obtained
assuming Poisson statistics.

We also repeated our analysis after masking out areas
of sky covered by RC3 galaxies (de Vaucouleurs et al. 1991;
Corwin et al. 1994) to test whether we could decrease the
contamination level. We did not observe any qualitative
differences in the estimated power law parameters and, more
importantly, the amplitude of w(θ) at small scales did not
decrease, indicating that the RC3 catalogue does not capture
all over-deblended galaxies in the SDSS galaxy catalogue.

Finally, it is important to note (and caution) that
the source contamination due to over-deblending only
became apparent when interpreting the bottom right panels of
Figs. 5 and 10. Had we completely trusted the results of the
scaling test (Appendix B1) or used only the data point near
L* in Fig. B5 (since that population dominates), we would
have significantly underestimated the number of spurious
objects.



9 GAMA objects have been visually inspected and are therefore 
more reliable than objects in the SDSS imaging catalogue. On 
the other hand, GAMA has a smaller area, which increases the 
statistical errors. For this sample, considering Poisson errors only, 
the statistical errors on w(θ) would be at least three times larger
than the ones obtained from the SDSS sample. 



